Abstract

Two parties observing correlated random variables seek to run an interactive communication protocol. 
How many bits must they exchange to simulate the protocol, namely to produce a view with a joint 
distribution within a fixed statistical distance of the joint distribution of the input and the transcript 
of the original protocol? We present an information spectrum approach for this problem whereby the 
information complexity of the protocol is replaced by its information complexity density. Our single-shot
bounds relate the communication complexity of simulating a protocol to tail bounds for information
complexity density. As a consequence, we obtain a strong converse and characterize the second-order 
asymptotic term in communication complexity for independent and identically distributed observation 
sequences. Furthermore, we obtain a general formula for the rate of communication complexity which 
applies to any sequence of observations and protocols. Connections with results from theoretical computer 
science and implications for the function computation problem are discussed. 


I. Introduction 

Two parties observing random variables $X$ and $Y$ seek to run an interactive protocol $\pi$ with inputs $X$
and $Y$. The parties have access to private as well as shared public randomness. What is the minimum
number of bits that they must exchange in order to simulate $\pi$ to within a fixed statistical distance $\varepsilon$?
This question is of importance to the theoretical computer science as well as the information theory
communities. On the one hand, it is related closely to the communication complexity problem [53],
which in turn is an important tool for deriving lower bounds for computational complexity [27] and for
space complexity of streaming algorithms. On the other hand, it is a significant generalization of
the classical information theoretic problem of distributed data compression 11431 . replacing data to be 
compressed with an interactive protocol and allowing interactive communication as opposed to the usual 
one-sided communication. 

In recent years, it has been argued that the distributional communication complexity for simulating a
protocol¹ $\pi$ is related closely to its information complexity² $\mathrm{IC}(\pi)$, defined as follows:

\[ \mathrm{IC}(\pi) = I(\Pi \wedge X \mid Y) + I(\Pi \wedge Y \mid X), \]

where $I(X \wedge Y \mid Z)$ denotes the conditional mutual information between $X$ and $Y$ given $Z$ (cf. [44], [13]).
For a protocol $\pi$ with communication complexity $|\pi|$ (the depth of the binary protocol tree), a simulation
protocol requiring $\tilde{O}(\sqrt{\mathrm{IC}(\pi)|\pi|})$ bits of communication was given in [4], and one requiring $2^{O(\mathrm{IC}(\pi))}$
bits of communication was given in [10]. A general version of the simulation problem was considered
in [55], but only bounded round simulation protocols were considered. Interestingly, it was shown in
[8] that the amortized³ distributional communication complexity of simulating $n$ copies of a protocol
$\pi$ for vanishing simulation error is bounded above by⁴ $\mathrm{IC}(\pi)$. While a matching lower bound was also
derived in [8], it is not valid in our context: [8] considered function computation and used a coordinate-wise
error criterion. Nevertheless, we can readily modify the lower bound argument in [8] and use the
continuity of conditional mutual information to formally obtain the required lower bound and thereby a
characterization of the amortized distributional communication complexity for vanishing simulation error.
Specifically, denoting by $D(\pi^n)$ the distributional communication complexity of simulating $n$ copies of
a protocol $\pi$ with vanishing simulation error, we have

\[ \lim_{n \to \infty} \frac{1}{n} D(\pi^n) = \mathrm{IC}(\pi). \]

Perhaps motivated by this characterization, or a folklore version of it, the research in this area has focused
on designing simulation protocols for $\pi$ requiring communication of length depending on $\mathrm{IC}(\pi)$; the
results cited above belong to this category as well. However, the central role of $\mathrm{IC}(\pi)$ in the distributional
communication complexity of protocol simulation is far from settled and many important questions remain 

1 The difference between simulation and compression of protocols is significant and is discussed in Remark 2 below.

2 For brevity, we do not display the dependence of $\mathrm{IC}(\pi)$ on the (fixed) distribution $P_{XY}$.

3 Throughout the paper, "amortized" indicates that the observations are independent and identically distributed (IID) and the
protocol to be simulated is $n$ copies of the same protocol.

4 Braverman and Rao actually used their general simulation protocol as a tool for deriving the amortized distributional 
communication complexity of function computation. This result was obtained independently by Ma and Ishwar in 1211 using 
standard information theoretic techniques. 
unanswered. For instance, (a) does $\mathrm{IC}(\pi)$ suffice to capture the dependence of distributional communication
complexity on the simulation error $\varepsilon$? (b) Does information complexity have an operational role
in simulating $\pi^n$ besides being the leading asymptotic term? (c) How about the simulation of more
complicated protocols such as a mixture $\pi_{\mathrm{mix}}$ of two product protocols $\pi_h^n$ and $\pi_t^n$: does $\mathrm{IC}(\pi_{\mathrm{mix}})$ still
constitute the leading asymptotic term in the communication complexity of simulating $\pi_{\mathrm{mix}}$?

The quantity $\mathrm{IC}(\pi)$ plays the same role in the simulation of protocols as $H(X)$ in the compression
of $X^n$ [41] and $H(X|Y)$ in the transmission of $X^n$ by the first to the second party with access to
$Y^n$ [45]. The questions raised above have been addressed for these classical problems (cf. [22]). In this
paper, we answer these questions for simulation of interactive protocols. In particular, we answer all
these questions in the negative by exhibiting another quantity that plays such a fundamental role and
can differ from information complexity significantly. To this end, we introduce the notion of information
complexity density of a protocol $\pi$ with inputs $X$ and $Y$ generated from a fixed distribution $P_{XY}$.


Definition 1 (Information complexity density). The information complexity density of a private coin
protocol $\pi$ is given by the function

\[ \mathrm{ic}(\tau; x, y) = \log \frac{P_{\Pi|XY}(\tau|x,y)}{P_{\Pi|X}(\tau|x)} + \log \frac{P_{\Pi|XY}(\tau|x,y)}{P_{\Pi|Y}(\tau|y)} \]

for all observations $x$ and $y$ of the two parties and all transcripts $\tau$, where $P_{\Pi XY}$ denotes the joint
distribution of the observation of the two parties and the random transcript $\Pi$ generated by $\pi$.
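As a minimal sketch of Definition 1, the density can be evaluated exhaustively for a small protocol. The instance below is hypothetical and not taken from the paper: $X$ and $Y$ are correlated bits, and the protocol is a single message $T$ equal to $X$ flipped with probability `FLIP` using Party 1's private coin. The names `P_XY`, `FLIP`, `marginal`, and `cond_mi` are illustrative. The sketch also checks numerically that averaging the density recovers $I(\Pi \wedge X \mid Y) + I(\Pi \wedge Y \mid X)$.

```python
from itertools import product
from math import log2

# Hypothetical toy instance (not from the paper): correlated input bits and a
# one-message private coin protocol T = X xor Bernoulli(FLIP) noise.
P_XY = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
FLIP = 0.2

def p_t_given_xy(t, x, y):
    # The message depends on x and Party 1's private coin only.
    return 1 - FLIP if t == x else FLIP

# Joint pmf of (T, X, Y); keys are (t, x, y).
P_TXY = {(t, x, y): P_XY[x, y] * p_t_given_xy(t, x, y)
         for t, x, y in product((0, 1), repeat=3)}

def marginal(pmf, keep):
    """Marginal pmf over the coordinates listed in `keep`."""
    out = {}
    for key, p in pmf.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out

P_TX, P_TY = marginal(P_TXY, (0, 1)), marginal(P_TXY, (0, 2))
P_X, P_Y = marginal(P_TXY, (1,)), marginal(P_TXY, (2,))

def ic_density(t, x, y):
    """ic(t; x, y) as in Definition 1, in bits."""
    p_t_xy = P_TXY[t, x, y] / P_XY[x, y]
    p_t_x = P_TX[t, x] / P_X[(x,)]
    p_t_y = P_TY[t, y] / P_Y[(y,)]
    return log2(p_t_xy / p_t_x) + log2(p_t_xy / p_t_y)

def entropy(pmf):
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def cond_mi(a, b, c):
    """I(A ^ B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C), by coordinate index."""
    return (entropy(marginal(P_TXY, a + c)) + entropy(marginal(P_TXY, b + c))
            - entropy(marginal(P_TXY, a + b + c)) - entropy(marginal(P_TXY, c)))

mean_ic = sum(p * ic_density(*key) for key, p in P_TXY.items())
ic = cond_mi((0,), (1,), (2,)) + cond_mi((0,), (2,), (1,))
```

In this toy instance the message is conditionally independent of $Y$ given $X$, so the first logarithm in the density vanishes and only the second term contributes.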


Note that $\mathrm{IC}(\pi) = \mathbb{E}[\mathrm{ic}(\Pi; X, Y)]$. We show that it is the $\varepsilon$-tail of the information complexity density
$\mathrm{ic}(\Pi; X, Y)$, i.e., the supremum⁵ over values of $\lambda$ such that $\Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) > \varepsilon$, which governs
the communication complexity of simulating a protocol with simulation error less than $\varepsilon$, and not the
information complexity of the protocol. The information complexity $\mathrm{IC}(\pi)$ becomes the leading term in
communication complexity for simulating $\pi$ only when roughly

\[ \mathrm{IC}(\pi) \gg \sqrt{\mathrm{Var}(\mathrm{ic}(\Pi; X, Y)) \log(1/\varepsilon)}. \]

This condition holds, for instance, in the amortized regime considered in [8]. However, the $\varepsilon$-tail of
$\mathrm{ic}(\Pi; X, Y)$ can differ significantly from $\mathrm{IC}(\pi)$, the mean of $\mathrm{ic}(\Pi; X, Y)$. In Appendix A we provide
an example protocol with inputs of size $2^n$ such that for $\varepsilon = 1/n^3$, the $\varepsilon$-tail of $\mathrm{ic}(\Pi; X, Y)$ is greater
than $2n$ while $\mathrm{IC}(\pi)$ is very small, just $O(n^{-2})$.


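5 Formally, our lower bound uses the lower $\varepsilon$-tail $\sup\{\lambda : \Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) > \varepsilon\}$ and the upper bound uses the upper $\varepsilon$-tail
$\inf\{\lambda : \Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) < \varepsilon\}$. For many interesting cases, the two coincide.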
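The two tails in the footnote can be computed directly for a finitely supported random variable. The sketch below assumes the density takes finitely many values given as a dictionary from value to probability; the function names and the two-point example are hypothetical and are not the Appendix A protocol. The example only mimics its scaling: a small mean together with a large $\varepsilon$-tail.

```python
def tail_probability(pmf, lam):
    """Pr(Z > lam) for a finitely supported Z given as {value: probability}."""
    return sum(p for z, p in pmf.items() if z > lam)

def lower_eps_tail(pmf, eps):
    """sup{lam : Pr(Z > lam) > eps}, assuming 0 < eps < 1.

    Pr(Z > lam) is a non-increasing step function that changes only at atoms,
    so the supremum is the smallest atom z with Pr(Z > z) <= eps.
    """
    return min(z for z in pmf if tail_probability(pmf, z) <= eps)

def upper_eps_tail(pmf, eps):
    """inf{lam : Pr(Z > lam) < eps}, assuming 0 < eps < 1."""
    return min(z for z in pmf if tail_probability(pmf, z) < eps)

# Hypothetical two-point density: value 2n with probability 2/n^3, else 0.
n = 10
pmf = {0.0: 1 - 2 / n**3, 2.0 * n: 2 / n**3}
eps = 1 / n**3
mean = sum(z * p for z, p in pmf.items())   # of order n^-2
lower = lower_eps_tail(pmf, eps)            # of order n
upper = upper_eps_tail(pmf, eps)

# The two tails differ only when Pr(Z > z) equals eps exactly at some atom.
coin = {0: 0.5, 1: 0.5}
coin_lower, coin_upper = lower_eps_tail(coin, 0.5), upper_eps_tail(coin, 0.5)
```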
A. Summary of results

Our main results are bounds for the distributional communication complexity $D_\varepsilon(\pi)$ for $\varepsilon$-simulating a
protocol $\pi$. The key quantity in our bounds is the $\varepsilon$-tail $\lambda_\varepsilon$ of $\mathrm{ic}(\Pi; X, Y)$.

Lower bound. Our main contribution is a general lower bound for $D_\varepsilon(\pi)$. We show that for every
private coin protocol $\pi$, $D_\varepsilon(\pi) \gtrsim \lambda_\varepsilon$. In fact, this bound does not rely on the structure of the random variable
$\Pi$ and is valid for the more general problem of simulating a correlated random variable.

Prior to this work, there was no lower bound that captured both the dependence on simulation error $\varepsilon$ and
the underlying probability distribution. On the one hand, the lower bound above yields many sharp
results in the amortized regime. It gives the leading asymptotic term in the communication complexity
for simulating any sequence of protocols, and not just product protocols. For product protocols, it yields
the precise dependence of communication complexity on $\varepsilon$ as well as the exact second-order asymptotic
term. On the other hand, it sheds light on the dependence of $D_\varepsilon(\pi)$ on $\varepsilon$ even in the single-shot regime.
For instance, our lower bound can be used to exhibit an arbitrary separation between $D_\varepsilon(\pi)$ and $\mathrm{IC}(\pi)$
when $\varepsilon$ is not fixed. Specifically, consider the example protocol in Appendix A. On evaluating our lower
bound for this protocol, for $\varepsilon = 1/n^3$ we get $D_\varepsilon(\pi) = \Omega(n)$, which is far more than $2^{\mathrm{IC}(\pi)}$ since
$\mathrm{IC}(\pi) = O(n^{-2})$. Remarkably, [21], [20] exhibited exponential separation between the distributional
communication complexity of computing a function and the information complexity of that function
even for a fixed $\varepsilon$, thereby establishing the optimality of the upper bound $D_\varepsilon(\pi) \le 2^{O(\mathrm{IC}(\pi))}$ given
in [10]. Our simple example shows a much stronger separation between $D_\varepsilon(\pi)$ and $\mathrm{IC}(\pi)$, albeit for a
vanishing $\varepsilon$.

Upper bound. To establish our asymptotic results, we propose a new simulation protocol, which is of
independent interest. For a protocol $\pi$ with bounded rounds of interaction, using our proposed protocol
we can show that $D_\varepsilon(\pi) \lesssim \lambda_\varepsilon$. Much as the protocol of [8], our simulation protocol simulates one round
at a time, and thus, the slack in our upper bound does depend on the number of rounds.

Note that while the operative term in the lower bound and the upper bound is the $\varepsilon$-tail of $\mathrm{ic}(\Pi; X, Y)$,
the lower bound approaches it from below and the upper bound approaches it from above. It is often
the case that these two limits match and the leading terms in our bounds coincide. See Figure 1 for an
illustration of our bounds.

Fig. 1: Illustration of lower and upper bounds for $D_\varepsilon(\pi)$, drawn on the distribution of $\mathrm{ic}(\Pi; X, Y)$.

Amortized regime: second-order asymptotics. Denote by $\pi^n$ the $n$-fold product protocol obtained by
applying $\pi$ to each coordinate $(X_t, Y_t)$ for inputs $X^n$ and $Y^n$. Consider the communication complexity
$D_\varepsilon(\pi^n)$ of $\varepsilon$-simulating $\pi^n$ for independent and identically distributed (IID) $(X^n, Y^n)$ generated from
$P_{XY}$. Using the bounds above, we can obtain the following sharpening of the results of [8]: With $V(\pi)$
denoting the variance of $\mathrm{ic}(\Pi; X, Y)$,

\[ D_\varepsilon(\pi^n) = n\,\mathrm{IC}(\pi) + \sqrt{n V(\pi)}\, Q^{-1}(\varepsilon) + o(\sqrt{n}), \]

where $Q(x)$ is equal to the probability that a standard normal random variable exceeds $x$ and $Q^{-1}(\varepsilon) \approx
\sqrt{\log(1/\varepsilon)}$. On the other hand, the arguments in⁶ [8] or [55] give us

\[ D_\varepsilon(\pi^n) \ge n\,\mathrm{IC}(\pi) - n\varepsilon\,[\,|\pi| + \log |\mathcal{X}||\mathcal{Y}|\,] - \varepsilon \log(1/\varepsilon). \]

But the precise communication requirement is not less but $\sqrt{n V(\pi) \log(1/\varepsilon)}$ more than $n\,\mathrm{IC}(\pi)$.
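To get a feel for the size of the second-order term, the first two terms of the expansion can be evaluated numerically. This is a sketch only: it drops the $o(\sqrt{n})$ remainder, the function names are illustrative, and the values of $\mathrm{IC}(\pi)$ and $V(\pi)$ used below are hypothetical placeholders rather than quantities computed in the paper.

```python
from math import sqrt
from statistics import NormalDist

def q_inverse(eps):
    """Q^{-1}(eps): the point that a standard normal exceeds with probability eps."""
    return NormalDist().inv_cdf(1.0 - eps)

def second_order_estimate(n, ic, variance, eps):
    """n*IC + sqrt(n*V)*Q^{-1}(eps), ignoring the o(sqrt(n)) remainder."""
    return n * ic + sqrt(n * variance) * q_inverse(eps)

# Hypothetical per-copy values, for illustration only.
IC, V = 0.5, 0.25
at_half = second_order_estimate(100, IC, V, 0.5)     # Q^{-1}(1/2) = 0
at_small = second_order_estimate(100, IC, V, 0.01)   # positive back-off

# The per-copy excess over IC shrinks like 1/sqrt(n).
excess = [(second_order_estimate(n, IC, V, 0.01) - n * IC) / n
          for n in (100, 10_000, 1_000_000)]
```

For $\varepsilon < 1/2$ the correction is positive, which is the sense in which the requirement is more, not less, than $n\,\mathrm{IC}(\pi)$.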

General formula for amortized communication complexity. The lower and upper bounds above
can be used to derive a formula for the first-order asymptotic term, the coefficient of $n$, in $D_\varepsilon(\pi_n)$ for
any sequence of protocols $\pi_n$ with inputs $X_n \in \mathcal{X}^n$ and $Y_n \in \mathcal{Y}^n$ generated from any sequence of
distributions $P_{X_n Y_n}$. We illustrate our result by the following example.

Example 1 (Mixed protocol). Consider two protocols $\pi_h$ and $\pi_t$ with inputs $X$ and $Y$ such that
$\mathrm{IC}(\pi_h) > \mathrm{IC}(\pi_t)$. For $n$ IID observations $(X^n, Y^n)$ drawn from $P_{XY}$, we seek to simulate the mixed protocol $\pi_{\mathrm{mix},n}$
defined as follows: Party 1 first flips a (private) coin with probability $p$ of heads and sends the outcome
$\Pi_0$ to Party 2. Depending on the outcome of the coin, the parties execute $\pi_h$ or $\pi_t$ $n$ times, i.e., they
use $\pi_h^n$ if $\Pi_0 = h$ and $\pi_t^n$ if $\Pi_0 = t$. What is the amortized communication complexity of simulating the
mixed protocol $\pi_{\mathrm{mix},n}$? Note that

\[ \mathrm{IC}(\pi_{\mathrm{mix},n}) = n\,[\,p\,\mathrm{IC}(\pi_h) + (1 - p)\,\mathrm{IC}(\pi_t)\,]. \]

Is it true that, in the manner of [8], the leading asymptotic term in $D_\varepsilon(\pi_{\mathrm{mix},n})$ is $\mathrm{IC}(\pi_{\mathrm{mix},n})$? In fact, it is
not so. Our general formula implies that for all $p \in (0,1)$,

\[ D_\varepsilon(\pi_{\mathrm{mix},n}) = n\,\mathrm{IC}(\pi_h) + o(n). \]

6 The proof in (8) uses the inequality IC(7r) < 17r|, a multiparty extension of which is available in 8131 . [32| . 
This is particularly interesting when $p$ is very small and $\mathrm{IC}(\pi_h) \gg \mathrm{IC}(\pi_t)$.
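The gap between the mean and the tail in the mixed protocol can be illustrated numerically. The sketch below rests on assumptions that are not in the text: within each branch the density of the $n$-fold product is approximated by a Gaussian with mean $n\,\mathrm{IC}$ and variance $nV$, the simulation error $\varepsilon$ is taken smaller than $p$ so that the heads branch cannot be ignored, and all numerical values and function names are hypothetical. Under these assumptions the $\varepsilon$-tail per copy sits near $\mathrm{IC}(\pi_h)$ rather than near the mixture mean.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def q(x):
    """Probability that a standard normal random variable exceeds x."""
    return 1.0 - STD_NORMAL.cdf(x)

def mixture_tail(lam, n, p, ic_h, v_h, ic_t, v_t):
    """Gaussian approximation of Pr(ic > lam) for the mixed protocol."""
    heads = q((lam - n * ic_h) / sqrt(n * v_h))
    tails = q((lam - n * ic_t) / sqrt(n * v_t))
    return p * heads + (1 - p) * tails

def eps_tail(eps, n, p, ic_h, v_h, ic_t, v_t):
    """Bisection for the threshold at which the approximate tail drops to eps."""
    lo = 0.0
    hi = n * max(ic_h, ic_t) + 10 * sqrt(n * max(v_h, v_t))
    for _ in range(200):
        mid = (lo + hi) / 2
        if mixture_tail(mid, n, p, ic_h, v_h, ic_t, v_t) > eps:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical values: heads is rare (p = 0.1) but costly; eps < p.
n, p, eps = 10_000, 0.1, 0.01
ic_h, v_h, ic_t, v_t = 2.0, 1.0, 1.0, 1.0
tail_rate = eps_tail(eps, n, p, ic_h, v_h, ic_t, v_t) / n
mean_rate = p * ic_h + (1 - p) * ic_t
```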

B. Proof techniques 

Proof for the lower bound. We present a new method for deriving lower bounds on distributional 
communication complexity. Our proof relies on a reduction argument that utilizes an $\varepsilon$-simulation to
generate an information theoretically secure secret key for X and Y (for a definition of the latter, see 
If33l , Iffll or Section [TV|. Heuristically, a protocol can be simulated using fewer bits of communication than 
its length because of the correlation in X and Y. Due to this correlation, when simulating the protocol, 
the parties agree on more bits (generate more common randomness) than what they communicate. These
extra bits can be extracted as an information theoretically secure secret key for the two parties using
the leftover hash lemma (cf. [6], [43]). A lower bound on the number of bits communicated can be
derived using an upper bound for the maximum possible length of a secret key that can be generated
using interactive communication; the latter was derived recently in [50], [51].

Protocol for the upper bound. We simulate a given protocol one round at a time. Simulation of each 
round consists of two subroutines: Interactive Slepian-Wolf compression and message reduction by public 
randomness. The first subroutine is an interactive version of the classical Slepian-Wolf compression [45]
for sending X to an observer of Y which is of optimal instantaneous rate. The second subroutine uses 
an idea that appeared first in iBTI (see, also, 031 . If54l0 and reduces the number of bits communicated in 
the first by realizing a portion of the required communication by the shared public randomness. This is 
possible since we are not required to recover a given random variable II, but only simulate it to within 
a fixed statistical distance. 

The proposed protocol is closely related to that in [8]. However, there are some crucial differences. The
protocol in [8], too, uses public randomness to sample each round of the protocol, before transmitting
it using an interactive communication of size incremented in steps. However, our information theoretic
approach provides a systematic method for choosing this step size. Furthermore, our protocol for sampling
the protocol from public randomness is significantly different from that in [8] and relies on randomness
extraction techniques. In particular, the protocol in [8] does not attain the asymptotically optimal bounds
achieved by our protocol.

Technical approach. While we utilize new, bespoke techniques for deriving our lower and upper
bounds, casting our problem in an information theoretic framework allows us to build upon the developments
in this classic field. In particular, we rely on the information spectrum approach of Han and Verdú,
introduced in the seminal paper [23] (see the textbook [22] for a detailed account). In this approach, the
classical measures of information such as entropy and mutual information are viewed as expectations
of the corresponding information densities, and the notion of "typical sets" is replaced by sets where
these information densities are bounded uniformly. The distribution of an information density (such as
$h(x) = -\log P_X(x)$), or the support of this distribution, is loosely referred to as its spectrum. Further,
we shall refer to the difference between the maximum and minimum values of $h(x)$ over its support as the length of
the spectrum. Coding theorems of classical information theory consider IID repetitions and rely on the
so-called asymptotic equipartition property [12], which essentially corresponds to the concentration
of spectrums on small intervals. For single-shot problems such concentrations are not available and we
have to work with the whole span of the spectrum.

Our main technical contribution in this paper is the extension of the information spectrum method to
handle interactive communication. Our results rely on the analysis of appropriately chosen information
densities and, in particular, rely on the spectrum of the information complexity density $\mathrm{ic}(\Pi; X, Y)$.
Different components of our analysis require bounds on these information densities in different directions,
which in turn renders our bounds loose and incurs a gap equal to the length of the corresponding
information spectrum. To overcome this shortcoming, we use the spectrum slicing technique of Han
[22]⁷ to divide the information spectrum into small portions with information densities closely bounded
from both sides. While in our upper bounds spectrum slicing is used to carefully choose the parameters
of the protocol, it is required in our lower bounds to identify a set of inputs where a given simulation
will require a large number of bits to be communicated.


C. Organization 

A formal statement of the problem along with the necessary preliminaries is given in the next section.
Section III contains all our results. In Section IV, we review the information theoretic secret key agreement
problem, the leftover hash lemma, and the data exchange problem, all of which will be instrumental in
our proofs. The formal proof of our lower bound is contained in Section V and that of our upper bound
in Section VI. Section VII contains a proof of our asymptotic results, followed by concluding remarks
in Section VIII.


D. Notations 

Random variables are denoted by capital letters such as $X$, $Y$, etc., realizations by small letters such
as $x$, $y$, etc., and their range sets by corresponding calligraphic letters such as $\mathcal{X}$, $\mathcal{Y}$, etc. Protocols are
denoted by appropriate subscripts or superscripts with $\pi$, the corresponding random transcripts by the
same sub- or superscripts with $\Pi$; $\tau$ is used as a placeholder for realizations of random transcripts. All
the logarithms in this paper are to the base 2.

7 The spectrum slicing technique was introduced in [22] to derive the error exponents of various problems for general sources
and a rate-distortion function for general sources.

The following convention, described for the entropy density, shall be used for all information densities
used in this paper. We shall abbreviate the entropy density $h_{P_X}(x) = -\log P_X(x)$ by $h(x)$ when there
is no confusion about $P_X$, and the random variable $h(X)$ corresponds to drawing $X$ from the distribution
$P_X$.

Whenever there is no confusion, we will not display the dependence of distributional communication 
complexity on the underlying distribution; the latter remains fixed in most of our discussion. 

II. Problem Statement 

Two parties observe correlated random variables $X$ and $Y$, with Party 1 observing $X$ and Party
2 observing $Y$, generated from a fixed distribution $P_{XY}$ and taking values in finite sets $\mathcal{X}$ and $\mathcal{Y}$,
respectively. An interactive protocol $\pi$ (for these two parties) consists of shared public randomness $U$,
private randomness⁸ $U_X$ and $U_Y$, and interactive communication $\Pi_1, \Pi_2, \ldots, \Pi_r$. The parties communicate
alternately, with Party 1 transmitting in the odd rounds and Party 2 in the even rounds. Specifically, in
each round $i$ one of the parties, say Party 1, communicates and transmits a string of bits $\Pi_i \in \{0,1\}^*$ determined
by the previous transmissions $\Pi_1, \ldots, \Pi_{i-1}$ and the observations $(X, U_X, U)$ of the communicating
party. To each possible value of the bit string $\Pi_i$, a state from the state space $\{\mathrm{C}, \emptyset\}$ is associated. If the
next state is $\mathrm{C}$, the other party starts communicating. If it is $\emptyset$, the protocol stops and each party generates
an output based on its local observation and the transcript $\Pi_1, \ldots, \Pi_r$ of the protocol. We assume without loss
of generality that Party 1 initiates the protocol. Note that the set $\mathcal{C}_i$ of possible values of $\Pi_i$, and the
associated next state $\mathrm{C}$ or $\emptyset$ for each value, is determined by a common function of $(X, U_X, U, \Pi^{i-1})$
and $(Y, U_Y, U, \Pi^{i-1})$ (cf. [19]), i.e., as a function of a random variable $V$ such that

\[ H(V \mid X, U_X, U, \Pi^{i-1}) = H(V \mid Y, U_Y, U, \Pi^{i-1}) = 0. \]

We denote the overall transcript of the protocol by $\Pi$. The length of a protocol $\pi$, $|\pi|$, is the maximum
number of bits that are communicated in any execution of the protocol.

In the special case where $\mathcal{C}_i$ is a prefix-free set determined by $\Pi^{i-1}$, the protocol is called a tree
protocol (cf. [53], [29]). In this case, the set of transcripts of the protocol can be represented by a
tree, termed the protocol tree, with each leaf corresponding to a particular realization of the transcript.

8 The random variables $U$, $U_X$, $U_Y$ are mutually independent and independent jointly of $(X, Y)$.
Specifically, the protocol is defined by a binary tree where each internal node $v$ is owned by either party,
and node $v$ is labeled either by a function $a_v : \mathcal{X} \times \mathcal{U}_X \times \mathcal{U} \to \{0,1\}$ or $b_v : \mathcal{Y} \times \mathcal{U}_Y \times \mathcal{U} \to \{0,1\}$.
Then each leaf, or the path from the root to the leaf, corresponds to the overall transcript. Our proposed
protocol is indeed a tree protocol. On the other hand, our converse bound applies to the more general
class of interactive protocols described above.

A random variable $F$ is said to be recoverable by $\pi$ for Party 1 (or Party 2) if $F$ is a function of
$(X, U, U_X, \Pi)$ (or $(Y, U, U_Y, \Pi)$).

A protocol with a constant $U$ is called a private coin protocol, one with a constant $(U_X, U_Y)$ is called a
public coin protocol, and one with $(U, U_X, U_Y)$ constant is called a deterministic protocol. Note that a private
coin protocol can be realized as a public coin protocol by sampling private coins from public coins. 

When we execute the protocol $\pi$ above, the overall view of the parties consists of random variables
$(X Y \Pi \Pi)$, where the two $\Pi$s correspond to the transcript of the protocol seen by the two parties. A
simulation of the protocol consists of another protocol which generates almost the same view as that
of the original protocol. We are interested in the simulation of private coin protocols, using arbitrary⁹
protocols; public coin protocols can be simulated by simulating for each fixed value of public randomness
the resulting private coin protocol.

Definition 2 ($\varepsilon$-Simulation of a protocol). Let $\pi$ be a private coin protocol. Given $0 < \varepsilon < 1$, a protocol
$\pi_{\mathrm{sim}}$ constitutes an $\varepsilon$-simulation of $\pi$ if there exist $\Pi_X$ and $\Pi_Y$, respectively, recoverable by $\pi_{\mathrm{sim}}$ for
Party 1 and Party 2 such that

\[ d_{\mathrm{var}}(P_{\Pi \Pi X Y}, P_{\Pi_X \Pi_Y X Y}) < \varepsilon, \tag{1} \]

where $d_{\mathrm{var}}(P, Q) = \frac{1}{2} \sum_x |P(x) - Q(x)|$ denotes the variational or the statistical distance between $P$ and $Q$.
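The distance in the simulation criterion is straightforward to compute for finitely supported distributions. The sketch below assumes both distributions are given as dictionaries from outcome to probability; the function names and the sample distributions are illustrative only.

```python
def statistical_distance(p, q):
    """d_var(P, Q) = (1/2) * sum_x |P(x) - Q(x)| for pmfs given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)

def within_simulation_error(view, simulated_view, eps):
    """Check the criterion d_var(view, simulated_view) < eps."""
    return statistical_distance(view, simulated_view) < eps

# Illustrative distributions over two outcomes.
fair = {"a": 0.5, "b": 0.5}
biased = {"a": 0.75, "b": 0.25}
distance = statistical_distance(fair, biased)
```

Here an outcome would stand for a full tuple (transcript seen by Party 1, transcript seen by Party 2, $x$, $y$), so that the comparison is between joint views and not between transcript marginals.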

Definition 3 (Distributional communication complexity). The $\varepsilon$-error distributional communication
complexity $D_\varepsilon(\pi | P_{XY})$ of simulating a private coin protocol $\pi$ is the minimum length of an $\varepsilon$-simulation
of $\pi$. The distribution $P_{XY}$ remains fixed throughout our analysis; for brevity, we shall abbreviate
$D_\varepsilon(\pi | P_{XY})$ by $D_\varepsilon(\pi)$.

Problem. Given a protocol $\pi$ and a joint distribution $P_{XY}$ for the observations of the two parties, we
seek to characterize $D_\varepsilon(\pi)$.

9 Since we are not interested in minimizing the amount of shared randomness used in a simulation, we allow arbitrary public 
coin protocols to be used as simulation protocols. 
Remark 1 (Deterministic protocols). Note that a deterministic protocol corresponds to an interactive
function. A specific instance of this situation appears in [49], where $\Pi(X, Y) = (X, Y)$ is considered.
For such protocols,

\[ d_{\mathrm{var}}(P_{\Pi \Pi X Y}, P_{\Pi_X \Pi_Y X Y}) = 1 - \Pr(\Pi = \Pi_X = \Pi_Y). \]

Therefore, a protocol is an $\varepsilon$-simulation of a deterministic protocol if and only if it computes the
corresponding interactive function with probability of error less than $\varepsilon$. Furthermore, randomization does
not help in this case, and it suffices to use deterministic simulation protocols. Thus, our results below
provide tight bounds for distributional communication complexity of interactive functions and even of
all functions which are information theoretically securely computable for the distribution $P_{XY}$, since
computing these functions is tantamount to computing an interactive function ||36l (see, also, 0 , my 

Remark 2 (Compression of protocols). A protocol $\pi_{\mathrm{com}}$ constitutes an $\varepsilon$-compression of a given protocol
$\pi$ if it recovers $\Pi_X$ and $\Pi_Y$ for Party 1 and Party 2 such that

\[ \Pr(\Pi = \Pi_X = \Pi_Y) > 1 - \varepsilon. \]

Note that randomization does not help in this case either. In fact, for deterministic protocols simulation and
compression coincide. In general, however, compression is a more demanding task than simulation and
our results show that in many cases (such as the amortized regime), compression requires strictly more
communication than simulation. Specifically, our results for $\varepsilon$-simulation in this paper can be modified to
get corresponding results for $\varepsilon$-compression by replacing the information complexity density $\mathrm{ic}(\tau; x, y)$
by

\[ h(\tau|x) + h(\tau|y) = -\log P_{\Pi|X}(\tau|x)\, P_{\Pi|Y}(\tau|y). \]

The proofs remain essentially the same and, in fact, simplify significantly.
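A short calculation, using only the definitions above, indicates why compression cannot be cheaper on average. Taking expectations of the two densities and expanding each conditional mutual information as a difference of conditional entropies, $I(\Pi \wedge Y \mid X) = H(\Pi \mid X) - H(\Pi \mid X, Y)$ and $I(\Pi \wedge X \mid Y) = H(\Pi \mid Y) - H(\Pi \mid X, Y)$, gives

\[ \mathbb{E}[h(\Pi|X) + h(\Pi|Y)] - \mathrm{IC}(\pi) = \big(H(\Pi|X) + H(\Pi|Y)\big) - \big(H(\Pi|X) + H(\Pi|Y) - 2H(\Pi|X,Y)\big) = 2H(\Pi \mid X, Y) \ge 0. \]

The gap vanishes exactly when $\Pi$ is a function of $(X, Y)$, which agrees with the observation that simulation and compression coincide for deterministic protocols. This compares only the means of the two densities; the single-shot statements involve their tails.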

III. Main Results 

We derive a lower bound for $D_\varepsilon(\pi)$ which applies to all private coin protocols $\pi$ and, in fact, applies
to the more general problem of communication complexity of sampling a correlated random variable. For
protocols with bounded number of rounds of interaction, admittedly a significant restriction, i.e., protocols
with $r = r(X, Y, U, U_X, U_Y) \le r_{\max}$ with probability 1, we present a simulation protocol which yields
upper bounds for $D_\varepsilon(\pi)$ of similar form as our lower bounds. In particular, in the asymptotic regime
our bounds improve over previously known bounds and are tight.
A. Lower bound

We prove the following lower bound. 

Theorem 1. Given $0 < \varepsilon < 1$ and a protocol $\pi$, for arbitrary $0 < \eta < 1/3$,

\[ D_\varepsilon(\pi) \ge \sup\{\lambda : \Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) > \varepsilon + \varepsilon'\} - \lambda', \tag{2} \]

where the fudge parameters $\varepsilon'$ and $\lambda'$ depend on $\eta$ as well as appropriately chosen information spectrums
and will be described below in (4) and (5).

The appearance of fudge parameters such as $\varepsilon'$ and $\lambda'$ in the bound above is typical since the techniques
to bound the tail probability of random variables invariably entail such parameters, which are tuned based
on the specific scenario being studied. For instance, the Chernoff bound has a parameter that is tuned
with respect to the moment generating function of the random variable of interest. More relevant to
the problem studied here, such fudge parameters also show up in the evaluation of error probability of
single-party non-interactive compression problems (cf. [23], [22]).

When the fudge parameters $\varepsilon'$ and $\lambda'$ are negligible, the right-side of the bound above is close to the
$\varepsilon$-tail of $\mathrm{ic}(\Pi; X, Y)$. Indeed, the fudge parameters turn out to be negligible in many cases of interest.
For instance, for the amortized case $\varepsilon'$ can be chosen to be arbitrarily small. The parameter $\lambda'$ is related
to the length of the interval in which the underlying information densities lie with probability greater than
$1 - \varepsilon'$, the essential length of spectrums. For the amortized case with product protocols, by the central
limit theorem the related essential spectrums are of length $\Lambda = O(\sqrt{n})$ and $\lambda' = O(\log \Lambda)$. On the other
hand, $\lambda_\varepsilon$ is $O(n)$. Thus, the $\log n$ order fudge parameter $\lambda'$ is negligible in this case. The same is true
also for the example protocol in Appendix A. Finally, it should be noted that similar fudge parameters
are ubiquitous in single-shot bounds; for instance, see [22, Lemma 1.3.2].

Remark 3. The result above does not rely on the interactive nature of $\Pi$ and is valid for simulation of
any random variable $\Pi$. Specifically, for any joint distribution $P_{\Pi XY}$, an $\varepsilon$-simulation satisfying (1) must
communicate at least as many bits as the right-side of (2), which is roughly equal to the largest value
$\lambda_\varepsilon$ of $\lambda$ such that $\Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) > \varepsilon$.

The fudge parameters. The fudge parameters $\varepsilon'$ and $\lambda'$ in Theorem 1 depend on the spectrums of
the following information densities:

(i) Information complexity density: This density is described in Definition 1 and will play a pivotal
role in our results.

(ii) Entropy density of $(X, Y)$: This density, given by $h(X, Y) = -\log P_{XY}(X, Y)$, captures the
randomness in the data and plays a fundamental role in the compression of the collective data of
the two parties (cf. [22]).

(iii) Conditional entropy density of $X$ given $Y\Pi$: The conditional entropy density $h(X|Y) = -\log P_{X|Y}(X|Y)$
plays a fundamental role in the compression of $X$ for an observer of $Y$ [34], [22]. We shall use
the conditional entropy density $h(X|Y\Pi)$ in our bounds.

(iv) Sum conditional entropy density of $(X\Pi, Y\Pi)$: The sum conditional entropy density, given by
$h(X \triangle Y) = -\log P_{X|Y}(X|Y)\, P_{Y|X}(Y|X)$, has been shown recently to play a fundamental role in
the communication complexity of the data exchange problem [49]. We shall use the sum conditional
entropy density $h(X\Pi \triangle Y\Pi)$.

(v) Information density of $X$ and $Y$: This density is given by $i(X \wedge Y) = h(X) - h(X|Y)$.

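Let $[\lambda_{\min}^{(1)}, \lambda_{\max}^{(1)}]$, $[\lambda_{\min}^{(2)}, \lambda_{\max}^{(2)}]$, and $[\lambda_{\min}^{(3)}, \lambda_{\max}^{(3)}]$ denote the "essential" spectrums of information densities
$\zeta_1 = h(X, Y)$, $\zeta_2 = h(X|Y\Pi)$, and $\zeta_3 = h(X\Pi \triangle Y\Pi)$, respectively. Concretely, let the tail events
$\mathcal{E}_i = \{\zeta_i \notin [\lambda_{\min}^{(i)}, \lambda_{\max}^{(i)}]\}$, $i = 1, 2, 3$, satisfy

\[ \Pr(\mathcal{E}_1) + \Pr(\mathcal{E}_2) + \Pr(\mathcal{E}_3) < \varepsilon_{\mathrm{tail}}, \tag{3} \]

where $\varepsilon_{\mathrm{tail}}$ can be chosen to be appropriately small. Further, let $\Lambda_i = \lambda_{\max}^{(i)} - \lambda_{\min}^{(i)}$, $i = 1, 2, 3$, denote
the corresponding effective spectrum lengths. The parameters $\varepsilon'$ and $\lambda'$ in Theorem 1 are given by

\[ \varepsilon' = \varepsilon_{\mathrm{tail}} + 2\eta \tag{4} \]

and

\[ \lambda' = 2 \log \Lambda_1 \Lambda_3 + \log \Lambda_2 - \log(1 - 3\eta) + 9 \log(1/\eta) + 3, \tag{5} \]

where $0 < \eta < 1/3$ is arbitrary. If $\Lambda_i = 0$ for some $i \in \{1, 2, 3\}$, we can replace it with 1 in the bound above.

Thus, our spectrum slicing approach allows us to reduce the dependence of $\lambda'$ on the spectrum lengths $\Lambda_i$
from linear to logarithmic.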
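The logarithmic dependence can be made concrete by evaluating (4) and (5) numerically. The sketch below assumes that $2 \log \Lambda_1 \Lambda_3$ in (5) means $2 \log(\Lambda_1 \Lambda_3)$ and that logarithms are to the base 2, as stated in the Notations; the function name and the sample spectrum lengths are hypothetical.

```python
from math import log2

def fudge_parameters(spectrum_lengths, eps_tail, eta):
    """Evaluate (4) and (5) for spectrum lengths (L1, L2, L3).

    Assumes 2*log(L1 L3) is meant in (5); zero lengths are replaced by 1,
    as the text allows.
    """
    if not 0 < eta < 1 / 3:
        raise ValueError("eta must lie strictly between 0 and 1/3")
    l1, l2, l3 = (length if length > 0 else 1.0 for length in spectrum_lengths)
    eps_prime = eps_tail + 2 * eta
    lam_prime = (2 * log2(l1 * l3) + log2(l2)
                 - log2(1 - 3 * eta) + 9 * log2(1 / eta) + 3)
    return eps_prime, lam_prime

# Degenerate spectrums: only the eta-dependent constants remain.
eps_flat, lam_flat = fudge_parameters((0, 0, 0), 0.01, 0.25)

# Hypothetical lengths (16, 4, 16) add 2*log2(256) + log2(4) = 18 bits,
# rather than a number of bits proportional to the lengths themselves.
eps_wide, lam_wide = fudge_parameters((16, 4, 16), 0.01, 0.25)
```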

B. Upper bound 

We prove the following upper bound. 

Theorem 2. For every $0 < \varepsilon < 1$ and every protocol $\pi$,

\[ D_\varepsilon(\pi) \le \inf\{\lambda : \Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) < \varepsilon - \varepsilon'\} + \lambda', \]

where the fudge parameters $\varepsilon'$ and $\lambda'$ depend on the maximum number of rounds of interaction in $\pi$ and
on appropriately chosen information spectrums.

Remark 4. In contrast to the lower bound given in the previous section, the upper bound above relies
on the interactive nature of $\pi$. Furthermore, the fudge parameters $\varepsilon'$ and $\lambda'$ depend on the number of
rounds, and the upper bound may not be useful when the number of rounds is not negligible compared
to the $\varepsilon$-tail of the information complexity density. However, we will see that the above upper bound is
tight for the amortized regime, even up to the second-order asymptotic term.

The simulation protocol. Our simulation protocol simulates the given protocol $\pi$ round-by-round,
starting from $\Pi_1$ to $\Pi_r$. Simulation of each round consists of two subroutines: Interactive Slepian-Wolf
compression and message reduction by public randomness.

The first subroutine uses an interactive version of the classical Slepian-Wolf compression [45] (see [34]
for a single-shot version) for sending X to an observer of Y. The standard (noninteractive) Slepian-Wolf
coding entails hashing X to $l$ values and sending the hash values to the observer of Y. The number of
hash values $l$ is chosen to take into account the worst-case performance of the protocol. However, we
are not interested in the worst-case performance of each round, but of the overall multiround protocol.
As such, we seek to compress X using the least possible instantaneous rate. To that end, we increase the
number of hash values gradually, $\Delta$ at a time, until the receiver decodes X and sends back an ACK. We
apply this subroutine to each round $i$, say $i$ odd, with $\Pi_i$ in the role of X and $(Y, \Pi_1, \ldots, \Pi_{i-1})$ in the role
of Y. Similar interactive Slepian-Wolf compression schemes have been considered earlier in different
contexts (c/. [El, OH. G21, 1221, ll49l ). 
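The incremental hashing idea can be illustrated with a toy loop. This is only a sketch under simplifying assumptions, not the protocol analysed in the paper: the receiver's side information is modelled as an explicit candidate list, the hashes are random GF(2) parities drawn from a seeded generator standing in for shared randomness, and the receiver checks all candidates rather than one spectrum slice at a time. The name `interactive_slepian_wolf` and its parameters are hypothetical.

```python
import random


def interactive_slepian_wolf(x, candidates, delta, seed=0, max_rounds=64):
    """Toy incremental-hashing loop with ACK feedback.

    x          : sender's value, a non-negative int
    candidates : values the receiver considers possible (must contain x)
    delta      : number of fresh hash bits sent per round
    Returns (decoded, bits_sent); decoded is None if no ACK is sent
    within max_rounds rounds.
    """
    rng = random.Random(seed)  # stand-in for shared public randomness
    nbits = max(max(candidates).bit_length(), 1)
    alive = list(candidates)
    for round_no in range(1, max_rounds + 1):
        masks = [rng.getrandbits(nbits) for _ in range(delta)]
        # Sender transmits delta parity bits of x.
        sent = [bin(x & m).count("1") % 2 for m in masks]
        # Receiver discards candidates that disagree with the new bits.
        alive = [c for c in alive
                 if [bin(c & m).count("1") % 2 for m in masks] == sent]
        if len(alive) == 1:  # unique survivor: receiver sends ACK
            return alive[0], round_no * delta
    return None, max_rounds * delta
```

Because the true value always agrees with its own hash bits, a unique survivor is necessarily correct in this toy variant; the number of bits sent then depends on the realization rather than on a worst-case choice of $l$.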

The second subroutine reduces the number of bits communicated in the first by realizing a portion
of the required communication by the shared public randomness U. Specifically, instead of transmitting
hash values of $\Pi_i$, we transmit hash values of a random variable $\hat{\Pi}_i$ generated in such a manner that some
of its corresponding hash bits can be extracted from U and the overall joint distributions do not change
by much. Since U is independent of (X, Y), the number $k$ of hash bits that can be realized using public
randomness is the maximum number of random hash bits of $\Pi_i$ that can be made almost independent of
(X, Y), a good bound for which is given by the leftover hash lemma. The overall simulation protocol
for $\Pi_i$ now communicates $l - k$ instead of $l$ bits. A similar technique for message reduction appears in
a different context in m, ei, ii. 

The overall performance of the protocol above is still suboptimal because the saving of $k$ bits is limited
by the worst-case performance. To remedy this shortcoming, we once again take recourse to spectrum
slicing to ensure that our saving $k$ is close to the best possible for each realization $(\Pi, X, Y)$.
Note that our protocol above is closely related to that proposed in [8]. However, the information
theoretic form here makes it amenable to techniques such as spectrum slicing, which leads to tighter
bounds than those established in [8].

The fudge parameters. The fudge parameters $\varepsilon'$ and $\lambda'$ in Theorem 2 depend on the spectrum of
various conditional information densities. To optimize the performance of each subroutine described
above, we slice the spectrum of the respective conditional information density involved. Specifically,
for odd round $t$, we slice the spectrum of $h(\Pi_t | Y\Pi^{t-1}) = -\log P_{\Pi_t|Y\Pi^{t-1}}(\Pi_t | Y, \Pi^{t-1})$ for interactive
Slepian-Wolf compression and $h(\Pi_t | X\Pi^{t-1}) = -\log P_{\Pi_t|X\Pi^{t-1}}(\Pi_t | X, \Pi^{t-1})$ for the substitution of
message by public randomness; for even rounds, the roles of X and Y are interchanged. Each round
involves some residuals related to the two conditional information densities. The fudge parameters $\varepsilon'$ and
$\lambda'$ are accumulations of the residuals of each round. The explicit expressions for $\varepsilon'$ and $\lambda'$ are rather
technical and are given in Section VI-E along with the proofs.


C. Amortized regime: second-order asymptotics 

It was shown in [8] that the information complexity of a protocol equals the amortized communication
rate for simulating the protocol, i.e.,

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{1}{n} D_\varepsilon(\pi^n | P^n_{XY}) = \mathrm{IC}(\pi),$$

where $P^n_{XY}$ denotes the n-fold product of the distribution $P_{XY}$, namely the distribution of random
variables $(X_i, Y_i)_{i=1}^n$ drawn IID from $P_{XY}$, and $\pi^n$ corresponds to running the same protocol $\pi$ on every
coordinate $(X_i, Y_i)$. Thus, $\mathrm{IC}(\pi)$ is the first-order term (coefficient of n) in the communication complexity
of simulating the n-fold product of the protocol. However, the analysis in [8] sheds no light on finer
asymptotics such as the second-order term or the dependence of $D_\varepsilon(\pi^n | P^n_{XY})$ on $\varepsilon$.$^{10}$ On the one hand,
it even remains unclear from [8] if a positive $\varepsilon$ reduces the amortized communication rate or not. On the
other hand, the amortized communication rate yields only a loose bound for $D_\varepsilon(\pi^n | P^n_{XY})$ for a finite,
fixed n. A better estimate of $D_\varepsilon(\pi^n | P^n_{XY})$ at a finite n and for a fixed $\varepsilon$ can be obtained by identifying
the second-order asymptotic term. Such second-order asymptotics were first considered in [46] and have
received a lot of attention in information theory in recent years following [24], [39].

Our lower bound in Theorem 1 and upper bound in Theorem 2 show that the leading term in
$D_\varepsilon(\pi^n | P^n_{XY})$ is roughly the $\varepsilon$-tail $\lambda_\varepsilon$ of the random variable $\mathrm{ic}(\Pi^n; X^n, Y^n) = \sum_{i=1}^n \mathrm{ic}(\Pi_i; X_i, Y_i)$, a
sum of n IID random variables. By the central limit theorem the first-order asymptotic term in $\lambda_\varepsilon$ equals
$n\mathbb{E}[\mathrm{ic}(\Pi; X, Y)] = n\,\mathrm{IC}(\pi)$, recovering the result of [8]. Furthermore, the second-order asymptotic term
depends on the variance $V(\pi)$ of $\mathrm{ic}(\Pi; X, Y)$, i.e., on

$$V(\pi) = \mathrm{Var}[\mathrm{ic}(\Pi; X, Y)].$$

$^{10}$ The lower bound in [8] gives only the weak converse, which holds only when $\varepsilon = \varepsilon_n \to 0$ as $n \to \infty$.


We have the following result. 

Theorem 3. For every $0 < \varepsilon < 1$ and every protocol $\pi$ with $V(\pi) > 0$,

$$D_\varepsilon(\pi^n | P^n_{XY}) = n\,\mathrm{IC}(\pi) + \sqrt{n V(\pi)}\, Q^{-1}(\varepsilon) + o(\sqrt{n}),$$
where $Q(x)$ is equal to the probability that a standard normal random variable exceeds $x$.
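As a rough numerical illustration of Theorem 3, the sketch below evaluates the first two terms of the expansion and ignores the $o(\sqrt{n})$ remainder. The function name `second_order_approx` is hypothetical, and the values of $\mathrm{IC}(\pi)$ and $V(\pi)$ must be supplied by the user; nothing here computes them from a protocol.

```python
from math import sqrt
from statistics import NormalDist


def second_order_approx(n, ic, var, eps):
    """Return n*IC + sqrt(n*V) * Q^{-1}(eps), dropping the o(sqrt(n)) term."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    # Q is the standard normal tail, so Q^{-1}(eps) = Phi^{-1}(1 - eps).
    q_inv = NormalDist().inv_cdf(1 - eps)
    return n * ic + sqrt(n * var) * q_inv
```

For $\varepsilon < 1/2$ the correction is positive and for $\varepsilon > 1/2$ it is negative, so a positive error tolerance changes the communication only at the $\sqrt{n}$ scale, in line with the strong converse stated next.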

As a corollary, we obtain the strong converse. 

Corollary 4. For every $0 < \varepsilon < 1$, the amortized communication rate

$$\lim_{n \to \infty} \frac{1}{n} D_\varepsilon(\pi^n | P^n_{XY}) = \mathrm{IC}(\pi).$$

Corollary 4 implies that the amortized communication complexity of simulating protocol $\pi$ cannot
be smaller than its information complexity even if we allow a positive error. Thus, if the length of the
simulation protocol $\pi_{\mathrm{sim}}$ is “much smaller” than $n\,\mathrm{IC}(\pi)$, the corresponding simulation error $\varepsilon = \varepsilon_n$ must
approach 1. But how fast does this $\varepsilon_n$ converge to 1? Our next result shows that this convergence is
exponentially rapid in n.

Theorem 5. Given a protocol $\pi$ and an arbitrary $\delta > 0$, for any simulation protocol $\pi_{\mathrm{sim}}$ with

$$|\pi_{\mathrm{sim}}| \le n[\mathrm{IC}(\pi) - \delta],$$

there exists a constant $E = E(\delta) > 0$ such that for every n sufficiently large, it holds that

$$d_{\mathrm{var}}\left(P_{\Pi^n \Pi^n X^n Y^n}, P_{\Pi_X^n \Pi_Y^n X^n Y^n}\right) \ge 1 - 2^{-En}.$$

A similar converse was first shown for the channel coding problem by Arimoto 0 (see fThl l. l(40l for 
further refinements of this result), and has been studied for other classical information theory problems 
as well. To the best of our knowledge, Theorem 5 is the first instance of an Arimoto converse for a
problem involving interactive communication. 

In the theoretical computer science literature, such converse results have been termed direct product
theorems and have been considered in the context of the (distributional) communication complexity
problem (for computing a given function) [9], liTTI, [26]. Our lower bound in Theorem 1, too, yields
a direct product theorem for the communication complexity problem. We state this simple result in
passing, skipping the details since they closely mimic Theorem 5. Specifically, given a function $f$ on
$\mathcal{X} \times \mathcal{Y}$, by a slight abuse of notation and terminology, let $D_\varepsilon(f) = D_\varepsilon(f | P_{XY})$ be the communication
complexity of computing $f$. As noted in Remark 3, Theorem 1 is valid for an arbitrary random variable
$\Pi$, and not just an interactive protocol. Then, by following the proof of Theorem 5 with $F = f(X, Y)$
replacing $\Pi$ in the application of Theorem 1, we get the following direct product theorem.

Theorem 6. Given a function $f$ and an arbitrary $\delta > 0$, for any function computation protocol $\pi$
computing estimates $F_{X,n}$ and $F_{Y,n}$ of $f^n$ at Party 1 and Party 2, respectively, and with length

$$|\pi| \le n[H(F|X) + H(F|Y) - \delta], \qquad (6)$$

there exists a constant $E = E(\delta) > 0$ such that for every n sufficiently large, it holds that

$$\Pr(F_{X,n} = F_{Y,n} = F^n) \le 2^{-En},$$

where $F^n := (F_1, \ldots, F_n)$ and $F_i := f(X_i, Y_i)$, $1 \le i \le n$.

Recall that [8], [31] showed that the first-order asymptotic term in the amortized communication
complexity for function computation equals the information complexity $\mathrm{IC}(f)$ of the function, namely
the infimum over $\mathrm{IC}(\pi)$ for all interactive protocols $\pi$ that recover $f$ with 0 error. Ideally, we would like
to show an Arimoto converse for this problem, i.e., replace the threshold on the right-side of (6) with
$n[\mathrm{IC}(f) - \delta]$. The direct product result above is weaker than such an Arimoto converse, and proving the
Arimoto converse for the function computation problem is work in progress. Nevertheless, the simple
result above is not comparable with the known direct product theorems in (3, (TO and can be stronger
in some regimes.$^{11}$

D. General formula for amortized communication complexity 

Consider arbitrary distributions $P_{X_nY_n}$ on $\mathcal{X}_n \times \mathcal{Y}_n$ and arbitrary protocols $\pi_n$ with inputs $X_n$ and $Y_n$
taking values in $\mathcal{X}_n$ and $\mathcal{Y}_n$, for each $n \in \mathbb{N}$. For vanishing simulation error $\varepsilon_n$, how does $D_{\varepsilon_n}(\pi_n | P_{X_nY_n})$
evolve as a function of n?

The previous section, and much of the theoretical computer science literature, has focused on the
case when $P_{X_nY_n} = P^n_{XY}$ and the same protocol $\pi$ is executed on each coordinate. In this section,
we identify the first-order asymptotic term in $D_{\varepsilon_n}(\pi_n | P_{X_nY_n})$ for a general sequence of distributions$^{12}$
$\{P_{X_nY_n}\}_{n=1}^{\infty}$ and a general sequence of protocols $\boldsymbol{\pi} = \{\pi_n\}_{n=1}^{\infty}$. Formally, the amortized (distributional)
communication complexity of $\boldsymbol{\pi}$ for $\{P_{X_nY_n}\}_{n=1}^{\infty}$ is given by$^{13}$

$$D(\boldsymbol{\pi}) := \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} D_\varepsilon(\pi_n | P_{X_nY_n}).$$

$^{11}$ The result in IE G2 shows a direct product theorem when we communicate less than $n\,\mathrm{IC}(f)/\mathrm{poly}(\log n)$.

Our goal is to characterize $D(\boldsymbol{\pi})$ for any given sequences $\{P_{X_nY_n}\}_{n=1}^{\infty}$ and $\boldsymbol{\pi}$. We seek a general formula for
$D(\boldsymbol{\pi})$ under minimal assumptions. Since we do not make any assumptions on the underlying distribution,
we cannot use any measure concentration results. Instead, we take recourse to probability limits of
information spectrums introduced by Han and Verdú in [23] for handling this situation (cf. [22]). Specifically,
for a sequence of protocols $\boldsymbol{\pi} = \{\pi_n\}_{n=1}^{\infty}$ and a sequence of observations $(\mathbf{X}, \mathbf{Y}) = \{(X_n, Y_n)\}_{n=1}^{\infty}$, the
sup information complexity is defined as

$$\overline{\mathrm{IC}}(\boldsymbol{\pi}) := \inf\left\{\alpha : \lim_{n \to \infty} \Pr\left(\frac{1}{n}\,\mathrm{ic}(\Pi_n; X_n, Y_n) > \alpha\right) = 0\right\},$$

where, with a slight abuse of notation, $\Pi_n$ is the transcript of protocol $\pi_n$ for observations $(X_n, Y_n)$. The
result below shows that it is $n\,\overline{\mathrm{IC}}(\boldsymbol{\pi})$, and not $\mathrm{IC}(\pi_n)$, that determines the communication complexity in
general.

Theorem 7. For every sequence of protocols $\boldsymbol{\pi} = \{\pi_n\}_{n=1}^{\infty}$,

$$D(\boldsymbol{\pi}) = \overline{\mathrm{IC}}(\boldsymbol{\pi}).$$

The proof uses Theorem 1 and Theorem 2 with carefully chosen spectrum-slice sizes.

For the case when $\pi_n = \pi^n$ and $P_{X_nY_n} = P^n_{XY}$, it follows from the law of large numbers that
$\overline{\mathrm{IC}}(\boldsymbol{\pi}) = \mathrm{IC}(\pi)$ and we recover the result of [8]. However, the utility of the general formula goes beyond
this simple amortized regime. Example 1 provides one such instance. In this case, $\overline{\mathrm{IC}}(\boldsymbol{\pi})$ can be easily
shown to equal $\mathrm{IC}(\pi_h)$ for any bias of the coin $\Pi_0$.

IV. Background: Secret Key Agreement and Data Exchange 

Our proofs draw from various techniques in cryptography and information theory. In particular, we use 
our recent results on information theoretic secret key agreement and data exchange, which are reviewed 
in this section together with the requisite background. 

$^{12}$ We do not require $P_{X_nY_n}$ to be even consistent.

$^{13}$ Although $D(\boldsymbol{\pi})$ also depends on $\{P_{X_nY_n}\}_{n=1}^{\infty}$, we omit the dependency in our notation.

A. Secret key agreement by public discussion

The problem of two party secret key agreement by public discussion was alluded to in 0, but a 
proper formulation and an asymptotically optimal construction appeared first in f33ll . ffl . Consider two 
parties with the first and the second party, respectively, observing the random variables X and Y. Using
an interactive protocol $\pi$ and their local observations, the parties agree on a secret key. A random variable
K constitutes a secret key if the two parties form estimates that agree with K with probability close to 1
and K is concealed, in effect, from an eavesdropper with access to the transcript $\Pi$ and a side-information
Z. Formally, let $K_X$ and $K_Y$, respectively, be recoverable by $\pi$ for the first and the second party. Such
random variables $K_X$ and $K_Y$ with common range $\mathcal{K}$ constitute an $\varepsilon$-secret key if the following condition
is satisfied:

$$d_{\mathrm{var}}\left(P_{K_X K_Y \Pi Z},\, P^{(2)}_{\mathrm{unif}} \times P_{\Pi Z}\right) \le \varepsilon,$$

where $P^{(2)}_{\mathrm{unif}}(k_x, k_y) = \mathbb{1}(k_x = k_y)/|\mathcal{K}|$ is the uniform distribution on the diagonal of $\mathcal{K} \times \mathcal{K}$.

The condition above ensures both reliable recovery, requiring $\Pr(K_X \neq K_Y)$ to be small, and information
theoretic secrecy, requiring the distribution of $K_X$ (or $K_Y$) to be almost independent of the eavesdropper’s
side information $(\Pi, Z)$ and to be almost uniform. See [50] for a discussion.

Definition 4. Given $0 < \varepsilon < 1$, the supremum over lengths $\log|\mathcal{K}|$ of an $\varepsilon$-secret key is denoted by
$S_\varepsilon(X, Y|Z)$, and for the case when Z is constant by $S_\varepsilon(X, Y)$.

By its definition, $S_\varepsilon(X, Y|Z)$ has the following monotonicity property.

Lemma 8 (Monotonicity). For any deterministic protocol $\pi$,

$$S_\varepsilon(X, Y|Z) \ge S_\varepsilon(X\Pi, Y\Pi | Z\Pi).$$

Furthermore, if $V_X$ and $V_Y$ can be recovered by $\pi$ for the first and the second party, respectively, then

$$S_\varepsilon(X, Y|Z) \ge S_\varepsilon(XV_X, YV_Y | Z\Pi).$$

The claim holds since the two parties can generate a secret key by first running $\pi$ and then generating
a secret key for the case when the first party observes $(X, \Pi)$, the second party observes $(Y, \Pi)$ and
the eavesdropper observes $(Z, \Pi)$. Similarly, the second inequality holds since the parties can ignore a
portion of their observations and generate a secret key from $(X, V_X)$ and $(Y, V_Y)$.

1) Leftover hash lemma: A key tool for generating secret keys is the leftover hash lemma (cf. [42])
which, given a random variable X and an $l$-bit eavesdropper’s observation Z, allows us to extract roughly
$H_{\min}(P_X) - l$ uniform bits, independent of Z. We shall use a slightly more general form. Given
random variables X and Z, let

$$H_{\min}(P_{XZ} | Q_Z) := -\log \sup_{x,z} \frac{P_{XZ}(x, z)}{Q_Z(z)}.$$

We define the conditional min-entropy of X given Z as

$$H_{\min}(P_{XZ} | Z) := \sup_{Q_Z :\, \mathrm{supp}(P_Z) \subset \mathrm{supp}(Q_Z)} H_{\min}(P_{XZ} | Q_Z).$$

Further, let $\mathcal{F}$ be a 2-universal family of mappings $f : \mathcal{X} \to \mathcal{K}$, i.e., for each $x' \neq x$, the family $\mathcal{F}$
satisfies

$$\frac{1}{|\mathcal{F}|} \sum_{f \in \mathcal{F}} \mathbb{1}(f(x) = f(x')) \le \frac{1}{|\mathcal{K}|}.$$
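The 2-universal property can be checked exhaustively for a small, standard family. The sketch below uses the Carter-Wegman construction $f_{a,b}(x) = ((ax + b) \bmod p) \bmod m$ with $p$ prime, which is an illustrative choice and not a family taken from the paper; the function name `collision_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def collision_probability(p, m):
    """Worst-case collision probability over pairs x != x' in {0..p-1}
    for f_{a,b}(x) = ((a*x + b) % p) % m, a in 1..p-1, b in 0..p-1.
    Assumes p is prime."""
    family = [(a, b) for a in range(1, p) for b in range(p)]
    worst = Fraction(0)
    for x, y in combinations(range(p), 2):
        hits = sum(1 for a, b in family
                   if ((a * x + b) % p) % m == ((a * y + b) % p) % m)
        worst = max(worst, Fraction(hits, len(family)))
    return worst
```

With $p = 17$ and $m = 4$ the worst-case collision probability is $7/34$, which is below $1/|\mathcal{K}| = 1/4$ as the definition requires.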

Lemma 9 (Leftover Hash). Consider random variables X, Z and V taking values in countable sets $\mathcal{X}$,
$\mathcal{Z}$, and a finite set $\mathcal{V}$, respectively. Let S be a random seed such that $f_S$ is uniformly distributed over a
2-universal family $\mathcal{F}$. Then, for $K_S = f_S(X)$,

$$\mathbb{E}_S\left[d_{\mathrm{var}}\left(P_{K_S V Z}, P_{\mathrm{unif}} P_{VZ}\right)\right] \le \frac{1}{2}\sqrt{|\mathcal{K}||\mathcal{V}|\, 2^{-H_{\min}(P_{XZ}|Z)}},$$

where $P_{\mathrm{unif}}$ is the uniform distribution on $\mathcal{K}$.

The version above is a straightforward modification of the leftover hash lemma in, for instance, [42]
and can be derived in a similar manner.
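A small exact computation illustrates the lemma in the simplest case where Z and V are constant, so that $H_{\min}(P_{XZ}|Z) = -\log \max_x P_X(x)$ and $|\mathcal{V}| = 1$. The sketch assumes that $d_{\mathrm{var}}$ is half the $\ell_1$ distance and uses the illustrative family $f_{a,b}(x) = ((ax+b) \bmod p) \bmod m$ with $p$ prime; the function name `expected_distance_from_uniform` is hypothetical.

```python
from fractions import Fraction
from math import sqrt


def expected_distance_from_uniform(p_x, p, m):
    """E_S[d_var(P_{K_S}, P_unif)] for K_S = ((a*X + b) % p) % m with the
    seed S = (a, b) uniform over a in 1..p-1, b in 0..p-1.
    p_x is a list of p Fractions giving the pmf of X on {0..p-1}."""
    total = Fraction(0)
    count = 0
    for a in range(1, p):
        for b in range(p):
            key_pmf = [Fraction(0)] * m
            for x, pr in enumerate(p_x):
                key_pmf[((a * x + b) % p) % m] += pr
            # d_var taken as half the l1 distance (an assumption).
            total += sum(abs(q - Fraction(1, m)) for q in key_pmf) / 2
            count += 1
    return total / count


def leftover_hash_bound(p_x, m):
    """(1/2) * sqrt(|K| * 2^{-H_min(P_X)}) with trivial Z and V."""
    return 0.5 * sqrt(m * float(max(p_x)))
```

For X uniform on 17 values and a 2-bit key, the exact expected distance is $3/68 \approx 0.044$, comfortably below the bound $\frac{1}{2}\sqrt{4/17} \approx 0.243$.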

As an application of the leftover hash lemma above, we get the following useful result.

Lemma 10. Consider random variables X, Y, Z and V taking values in countable sets $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$, and a
finite set $\mathcal{V}$, respectively. Then,

$$S_{2\varepsilon}(X, Y | ZV) \ge S_\varepsilon(X, Y | Z) - \log|\mathcal{V}| - 2\log(1/2\varepsilon).$$

The proof is relegated to Appendix A.

2) Conditional independence testing upper bound for secret key lengths: Next, we recall the conditional
independence testing upper bound for $S_\varepsilon(X, Y)$, which was established in [50], [51]. In fact, the general
upper bound in [50], [51] is a single-shot upper bound on the secret key length for a multiparty secret
key agreement problem with side information at the eavesdropper. Below, we recall a specialization of
the general result for the two party case with no side information at the eavesdropper. In order to state
the result, we need the following concept from binary hypothesis testing.

Consider a binary hypothesis testing problem with null hypothesis P and alternative hypothesis Q,
where P and Q are distributions on the same alphabet $\mathcal{V}$. Upon observing a value $v \in \mathcal{V}$, the observer
needs to decide if the value was generated by the distribution P or the distribution Q. To this end, the
observer applies a stochastic test T, which is a conditional distribution on {0,1} given an observation
$v \in \mathcal{V}$. When $v \in \mathcal{V}$ is observed, the test T chooses the null hypothesis with probability $T(0|v)$ and the
alternative hypothesis with probability $T(1|v) = 1 - T(0|v)$. For $0 < \varepsilon < 1$, denote by $\beta_\varepsilon(P, Q)$ the
infimum of the probability of error of type II given that the probability of error of type I is less than $\varepsilon$,
i.e.,

$$\beta_\varepsilon(P, Q) := \inf_{T :\, P[T] \ge 1 - \varepsilon} Q[T],$$

where

$$P[T] = \sum_{v} P(v)\,T(0|v), \qquad Q[T] = \sum_{v} Q(v)\,T(0|v).$$
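For finite alphabets, $\beta_\varepsilon(P, Q)$ can be computed by the Neyman-Pearson recipe: accept symbols in increasing order of the ratio $Q(v)/P(v)$ and randomize on the boundary symbol. The sketch below is one straightforward way to do this; the function name `beta` and the list-based representation of P and Q are assumptions made for illustration.

```python
def beta(eps, P, Q):
    """beta_eps(P, Q): the least Q[T] over tests T with P[T] >= 1 - eps.

    P and Q are lists of probabilities on the same finite alphabet.
    """
    need = 1.0 - eps  # P-mass the acceptance region must still collect
    # Symbols with the smallest Q(v)/P(v) cost the least type-II error
    # per unit of P-mass, so they are accepted first.
    order = sorted((v for v in range(len(P)) if P[v] > 0),
                   key=lambda v: Q[v] / P[v])
    total_q = 0.0
    for v in order:
        if need <= 0:
            break
        accept = min(1.0, need / P[v])  # T(0|v), fractional on the boundary
        total_q += accept * Q[v]
        need -= accept * P[v]
    return total_q
```

For example, with $P = (0.5, 0.5)$, $Q = (0.9, 0.1)$ and $\varepsilon = 0.25$, the test accepts the second symbol fully and the first with probability $1/2$, giving $\beta_\varepsilon = 0.55$.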

The following upper bound for $S_\varepsilon(X, Y)$ was established in [50], [51].

Theorem 11 (Conditional independence testing bound). Given $0 < \varepsilon < 1$, $0 < \eta < 1 - \varepsilon$, the following
bound holds:

$$S_\varepsilon(X, Y) \le -\log \beta_{\varepsilon + \eta}(P_{XY}, Q_X Q_Y) + 2\log(1/\eta),$$
for all distributions $Q_X$ and $Q_Y$ on $\mathcal{X}$ and $\mathcal{Y}$, respectively.

We close by noting a further upper bound for $\beta_\varepsilon(P, Q)$, which is easy to derive.

Lemma 12. For every $0 < \varepsilon < 1$ and $\lambda$,

$$-\log \beta_\varepsilon(P, Q) \le \lambda - \log\left(P\left(\log\frac{P(X)}{Q(X)} < \lambda\right) - \varepsilon\right)_+,$$

where $(x)_+ = \max\{0, x\}$. As a corollary, we obtain the following upper bound for $S_\varepsilon(X, Y)$:

$$S_\varepsilon(X, Y) \le \lambda - \log\left(\Pr\left(\log\frac{P_{XY}(X, Y)}{Q_X(X)\,Q_Y(Y)} < \lambda\right) - \varepsilon - \eta\right)_+ + 2\log(1/\eta),$$
for all distributions $Q_X$ and $Q_Y$.
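Lemma 12 can be sanity-checked numerically on a finite alphabet by comparing the exact value of $-\log \beta_\varepsilon(P, Q)$ with the right-hand side for several thresholds $\lambda$. The sketch below assumes base-2 logarithms, reads the event as $\{\log P/Q < \lambda\}$, and uses hypothetical helper names `beta` and `lemma12_bound`.

```python
from math import inf, log2


def beta(eps, P, Q):
    """Exact beta_eps(P, Q) via the Neyman-Pearson ordering."""
    need, total_q = 1.0 - eps, 0.0
    for v in sorted((v for v in range(len(P)) if P[v] > 0),
                    key=lambda v: Q[v] / P[v]):
        if need <= 0:
            break
        accept = min(1.0, need / P[v])
        total_q += accept * Q[v]
        need -= accept * P[v]
    return total_q


def lemma12_bound(eps, lam, P, Q):
    """lam - log( P(log P/Q < lam) - eps )_+ , infinite if the slack is 0."""
    mass = sum(p for p, q in zip(P, Q)
               if p > 0 and q > 0 and log2(p / q) < lam)
    slack = max(0.0, mass - eps)
    return inf if slack == 0 else lam - log2(slack)
```

With $P = (0.5, 0.5)$, $Q = (0.9, 0.1)$, $\varepsilon = 0.25$ and $\lambda = 0$, the bound evaluates to 2 bits while $-\log \beta_\varepsilon(P, Q) \approx 0.86$ bits.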


B. The data exchange problem

The next primitive that will be used in the reduction argument in our lower bound proof is a protocol
for data exchange. The parties observing X and Y seek to know each other’s data. What is the minimum
length of interactive communication required? This basic problem, first studied in [37], is in effect a
two-party extension of the classical Slepian-Wolf compression [45] (see [14] for a multiparty version).
In a recent work [49], we derived tight lower and upper bounds for the length of a protocol that, for a
given distribution $P_{XY}$, will facilitate data exchange with probability of error less than $\varepsilon$. We review the
proposed protocol and its performance here; first, we formally define the data exchange problem.

Definition 5. For $0 < \varepsilon < 1$, a protocol $\pi$ attains $\varepsilon$-data exchange if there exist $\hat{Y}$ and $\hat{X}$ which are
recoverable by $\pi$ for the first and the second party, respectively, and satisfy

$$P(\hat{X} = X, \hat{Y} = Y) \ge 1 - \varepsilon.$$


Note that data exchange corresponds to simulating a (deterministic) interactive protocol $\pi$ where
$\Pi_1(X) = X$ and $\Pi_2 = Y$; attaining $\varepsilon$-data exchange is tantamount to $\varepsilon$-simulation of $\pi$. In fact, the
specific protocol for data exchange proposed in [49] can be recovered as a special case of our simulation
protocol in Section VI. The next result paraphrases [49, Theorem 2] and can also be recovered as a
special case of Lemma 21.

We paraphrase the result from [49] in a form that is more suited for our application here. The
data exchange protocol proposed in [49] relies on slicing the spectrum of $h(X|Y)$ (or $h(Y|X)$). Let
$\mathcal{E}_{\mathrm{tail}}$ denote the tail event $h(X|Y) \notin [\lambda'_{\min}, \lambda'_{\max}]$. The protocol entails slicing the essential spectrum
$[\lambda'_{\min}, \lambda'_{\max}]$ into N parts of length $\Delta$ each, i.e.,

$$N = \frac{\lambda'_{\max} - \lambda'_{\min}}{\Delta}.$$


Theorem 13 ([49, Theorem 2], Lemma 21). Given $\lambda > 0$, $\xi > 0$, and N as above, there exists a
deterministic protocol for $\varepsilon$-data exchange satisfying the following properties:

(i) Denoting by $\mathcal{E}_{\mathrm{error}}$ the error event, it holds that

$$P_{XY}\left(\mathcal{E}_{\mathrm{error}} \cap \{h(X \triangle Y) \le \lambda\}\right) \le P_{XY}(\mathcal{E}_{\mathrm{tail}}) + N 2^{-\xi},$$
which further yields that the probability of error $\varepsilon$ is bounded above as

$$\varepsilon \le P_{XY}(h(X \triangle Y) > \lambda) + P_{XY}(\mathcal{E}_{\mathrm{tail}}) + N 2^{-\xi};$$

(ii) the protocol communicates no more than $\lambda + \Delta + N + \xi$ bits;

(iii) for every (X, Y) such that $\lambda'_{\min} \le h(X|Y) \le \lambda'_{\max}$, the transcript of the protocol can take no more
than $2^{h(X \triangle Y) + \Delta + \xi}$ values.


Note that property (iii) above, though not explicitly stated in [49, Theorem 2] or in the general
Lemma 21 below, follows simply from the proofs of these results. It makes the subtle observation that
while, for each (X, Y) such that $\lambda'_{\min} \le h(X|Y) \le \lambda'_{\max}$, $h(X \triangle Y) + \Delta + N + \xi$ bits are communicated
to interactively generate the transcript, the number of (variable length) transcripts is no more than$^{14}$
$2^{h(X \triangle Y) + \Delta + \xi}$. Property (ii) above was crucial to establish the communication complexity
results of [49]; property (iii) was not relevant in the context of that work. On the other hand, here we
shall use the protocol of Theorem 13 in our reduction to secret key agreement in the next section and will
treat the communication used in data exchange as eavesdropper’s side information. As such, it suffices
to bound the number of values taken by the transcript; the number of bits actually communicated in the
interactive protocol is a loose upper bound on the former quantity.


Interestingly, our simulation protocol given in Section VI is used both in our upper bound to compress 
a given protocol and in our lower bound to complete the reduction argument. 


V. Proof of Lower Bound 

As described in the introduction, our proof of Theorem 1 relies on generating a secret key for X and Y
from a given $\varepsilon$-simulation $\pi_{\mathrm{sim}}$ of $\pi$. However, there are two caveats in the heuristic approach described
in the introduction:

First, to extract secret keys from the generated common randomness we rely on the leftover hash lemma.
In particular, the bits are extracted by applying a 2-universal hash family to the common randomness
generated. However, the range-size of the hash family must be selected based on the min-entropy of the
generated common randomness, which is not easy to estimate. To remedy this, we communicate more
using a data-exchange protocol proposed in [49] to make the collective observations (X, Y) available
to both the parties; a good bound for the communication complexity of this protocol is available. The
generated common randomness now includes (X, Y) for which the min-entropy can be easily bounded
and the size of the aforementioned extracted secret key can be tracked. A similar common randomness
completion and decomposition technique was introduced in [48] to characterize a class of securely
computable functions.

$^{14}$ The N-bit ACK-NACK feedback used in the protocol can be determined from the length of the transcript.

Second, our methodology described above requires bounds on various information densities in different
directions. A direct application of this method will result in a gap equal to the effective length of various
spectrums involved. To remedy this, we apply the methodology described above not to the original
distribution $P_{XY}$ but to a conditional distribution $P_{XY|\mathcal{E}}$ where the event $\mathcal{E}$ is an appropriately chosen
event contained in single slices of various spectrums involved. Such a conditioning is allowed since we
are interested in the worst-case communication complexity of the simulation protocol.

We now describe the proof of Theorem 1 in detail. To make the exposition clear, we have divided the
proof into five steps.

Given a (private coin) protocol $\pi$, let $\pi_{\mathrm{sim}}$ be its $\varepsilon$-simulation and $\Pi_X$ and $\Pi_Y$ be the corresponding
estimates of the transcript $\Pi$ for Party 1 and Party 2, respectively.

A. From simulation to probability of error 

We first use a coupling argument to replace the $\varepsilon$-simulation condition with an $\varepsilon$ probability of error
condition. Recall the maximal coupling lemma.

Lemma 14 (Maximal Coupling Lemma [47]). For any two distributions P and Q on the same set,
there exists a joint distribution $P_{XY}$ with $X \sim P$ and $Y \sim Q$ such that

$$\Pr(X \neq Y) = d_{\mathrm{var}}(P, Q).$$
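For finite alphabets a maximal coupling can be written down explicitly: place mass $\min\{P(v), Q(v)\}$ on the diagonal and spread the leftover mass of P and Q as a product off the diagonal. The sketch below builds this standard construction with exact arithmetic; the function name `maximal_coupling` is hypothetical, and $d_{\mathrm{var}}$ is taken to be half the $\ell_1$ distance.

```python
from fractions import Fraction


def maximal_coupling(P, Q):
    """Return (joint, d_var) for pmfs P and Q given as equal-length lists
    of Fractions. joint[x][y] has marginals P and Q and satisfies
    Pr(X != Y) = d_var."""
    n = len(P)
    overlap = [min(p, q) for p, q in zip(P, Q)]
    d_var = 1 - sum(overlap)  # equals (1/2) * sum |P - Q|
    joint = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        joint[x][x] = overlap[x]
    if d_var > 0:
        for x in range(n):
            for y in range(n):
                # The leftover masses of P and Q sit on disjoint symbols,
                # so this product vanishes on the diagonal.
                joint[x][y] += (P[x] - overlap[x]) * (Q[y] - overlap[y]) / d_var
    return joint, d_var
```

For $P = (1/2, 1/2)$ and $Q = (3/4, 1/4)$ the coupling puts mass $1/4$ on the single off-diagonal pair $(1, 0)$, matching $d_{\mathrm{var}}(P, Q) = 1/4$.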

Using the maximal coupling lemma, for each fixed x, y there exists a joint distribution $P_{\Pi \Pi_X \Pi_Y | X=x, Y=y}$
such that

$$\Pr(\Pi = \Pi_X = \Pi_Y | X = x, Y = y) = 1 - d_{\mathrm{var}}\left(P_{\Pi\Pi | X=x, Y=y}, P_{\Pi_X \Pi_Y | X=x, Y=y}\right).$$

Consequently,

$$\begin{aligned}
\Pr(\Pi = \Pi_X = \Pi_Y) &= 1 - \sum_{x,y} P_{XY}(x, y)\, d_{\mathrm{var}}\left(P_{\Pi\Pi | X=x, Y=y}, P_{\Pi_X \Pi_Y | X=x, Y=y}\right) \\
&= 1 - d_{\mathrm{var}}\left(P_{\Pi\Pi XY}, P_{\Pi_X \Pi_Y XY}\right) \\
&\ge 1 - \varepsilon. \qquad (7)
\end{aligned}$$

As pointed out in footnote 9, we restrict ourselves to public coin protocols $\pi_{\mathrm{sim}}$ using shared public
randomness U. For concreteness (and convenience of proof), we define the joint distribution for $(\Pi, \Pi_X, \Pi_Y, X, Y, U)$
as

$$P_{\Pi_X \Pi_Y \Pi XYU} = P_{\Pi_X \Pi_Y \Pi XY}\, P_{U | \Pi_X \Pi_Y XY}. \qquad (8)$$

Note that the marginal $P_{\Pi_X \Pi_Y XYU}$ remains as in the original protocol. In particular, (X, Y) is jointly
independent of U.


B. From partial knowledge to omniscience 

As explained in the heuristic proof above, instead of extracting a secret key from the common
randomness generated by the protocol $\pi_{\mathrm{sim}}$, we first use the data exchange protocol of Theorem 13
to make all the data available to both the parties, which was termed attaining omniscience$^{15}$ in [14].
In particular, the parties run the protocol $\pi_{\mathrm{sim}}$ followed by a data exchange protocol for $(X\Pi, Y\Pi)$ to
recover (X, Y) at both the parties. Once both the parties have access to (X, Y), they can extract a secret
key from (X, Y) which will be used in the reduction in our final step.

Formally, with the notations introduced in Section IV-B, let $\pi_{\mathrm{DE}}$ be the data exchange protocol of
Theorem 13 with X and Y replaced by $(X\Pi)$ and $(Y\Pi)$, respectively, with $N_2$ and $\Delta_2$ denoting N and
$\Delta$, respectively, and with $\lambda = \lambda_{\max}^{(3)}$, $\lambda'_{\min} = \lambda_{\min}^{(2)}$, $\lambda'_{\max} = \lambda_{\max}^{(2)}$. Then, denoting by $\mathcal{E}_{\mathrm{error}}$ the error
event for the protocol $\pi_{\mathrm{DE}}$, Theorem 13(i) yields

$$\Pr(\mathcal{E}_{\mathrm{error}} \cap \mathcal{E}_3^c) \le \Pr(\mathcal{E}_2) + N_2 2^{-\xi}, \qquad (9)$$

where $\mathcal{E}_2$ and $\mathcal{E}_3$ are as in (3). Furthermore, for every realization $(X, Y) \notin \mathcal{E}_3$ the number of possible
transcripts $\Pi_{\mathrm{DE}}$ is no more than

$$2^{h(X\Pi \triangle Y\Pi) + \Delta_2 + \xi}. \qquad (10)$$

We seek to use $\pi_{\mathrm{DE}}$ for recovering Y and X, respectively, at Party 1 and Party 2 by running $\pi_{\mathrm{DE}}$
successively after $\pi_{\mathrm{sim}}$. However, $\pi_{\mathrm{sim}}$ yields $(X, \Pi_X)$ and $(Y, \Pi_Y)$ at Party 1 and Party 2, respectively, while
the data exchange protocol $\pi_{\mathrm{DE}}$ facilitates data exchange when the two parties observe $(X, \Pi)$ and $(Y, \Pi)$. We
can easily fix this gap using (7).

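Specifically, denote by $\hat{X}$ and $\hat{Y}$ the estimates of X and Y formed at Party 2 and Party 1 in $\pi_{\mathrm{DE}}$. Note
that $\pi_{\mathrm{DE}}$ is a deterministic protocol and $\hat{X}$ and $\hat{Y}$ are functions of $(X, Y, \Pi, \Pi)$. Denote by $\mathcal{A}$ the set

$$\mathcal{A} = \{(\tau_X, \tau_Y, \tau, x, y) : \tau_X = \tau_Y = \tau\}$$

and by $\mathcal{B}$ the set

$$\mathcal{B} = \{(\tau_X, \tau_Y, \tau, x, y) : \hat{X}(x, y, \tau, \tau) = x,\ \hat{Y}(x, y, \tau, \tau) = y\},$$

which is the same as $\mathcal{E}^c_{\mathrm{error}}$ for $\mathcal{E}_{\mathrm{error}}$ in (9). Then, by (7) and (9),

$$\begin{aligned}
&\Pr\left(\{\hat{X}(X, Y, \Pi_X, \Pi_Y) = X,\ \hat{Y}(X, Y, \Pi_X, \Pi_Y) = Y\} \cap \mathcal{E}_3^c\right) \\
&\quad \ge P_{\Pi_X \Pi_Y \Pi XY}\left(\mathcal{A} \cap \mathcal{B} \cap \mathcal{E}_3^c\right) \\
&\quad \ge P_{\Pi_X \Pi_Y \Pi XY}(\mathcal{A}) + \Pr(\mathcal{E}_3^c) - P_{\Pi_X \Pi_Y \Pi XY}\left(\mathcal{B}^c \cap \mathcal{E}_3^c\right) - 1 \\
&\quad \ge 1 - \varepsilon - \Pr(\mathcal{E}_2) - \Pr(\mathcal{E}_3) - N_2 2^{-\xi}. \qquad (11)
\end{aligned}$$

$^{15}$ Csiszár and Narayan considered a multiterminal version of the data exchange problem in [14] and connected the minimum
(amortized) rate of communication needed to the maximum (amortized) secret key rate.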


C. From simulation to secret keys: A rough sketch of the reduction 

The first step in our proof is to replace the simulation condition (1) with the probability of error
condition (7) for the joint distribution $P_{\Pi_X \Pi_Y \Pi XYU}$ in (8).

Next, we “complete the common randomness,” i.e., we communicate more to facilitate the recovery
of Y and X at Party 1 and Party 2, respectively. To that end, upon executing $\pi_{\mathrm{sim}}$, the parties run the
data exchange protocol $\pi_{\mathrm{DE}}$ of Theorem 13 for $(X\Pi)$ and $(Y\Pi)$, with $(X, \Pi_X)$ and $(Y, \Pi_Y)$ in place of
$(X\Pi)$ and $(Y\Pi)$, respectively. Condition (7) guarantees that the combined protocol $(\pi_{\mathrm{sim}}, \pi_{\mathrm{DE}})$ recovers
Y and X at Party 1 and Party 2 with probability of error less than $\varepsilon$.

We now sketch our reduction argument. Consider the secret key agreement for X and Y when the
eavesdropper observes U. By the independence of (X, Y) and U, $S_\eta(XU, YU | U) = S_\eta(X, Y)$, and
further, the result of [50] shows that $S_\eta(X, Y)$ is bounded above, roughly, by the mutual information
density $i(X \wedge Y) = \log \frac{P_{XY}(X, Y)}{P_X(X)\,P_Y(Y)}$, i.e.,

$$S_\eta(XU, YU | U) = S_\eta(X, Y) \lesssim i(X \wedge Y). \qquad (12)$$


On the other hand, we can generate a secret key using the following protocol:

1) Run the combined protocol $(\pi_{\mathrm{sim}}, \pi_{\mathrm{DE}})$ to attain data exchange for X and Y, resulting in a common
randomness of size roughly $h(X, Y | U) = h(X, Y)$.

2) The data exchange protocol $\pi_{\mathrm{DE}}$ for $(X\Pi)$ and $(Y\Pi)$ communicates roughly $h(X\Pi \triangle Y\Pi)$ bits for
every fixed realization $(X, Y, \Pi)$. Thus, the combined protocol $(\pi_{\mathrm{sim}}, \pi_{\mathrm{DE}})$, which allows both the
parties to recover (X, Y), communicates no more than $|\pi_{\mathrm{sim}}| + h(X\Pi \triangle Y\Pi)$ bits for every fixed
realization $(X, Y, \Pi)$. Using the leftover hash lemma, we can extract a secret key of rate roughly
$h(X, Y) - |\pi_{\mathrm{sim}}| - h(X\Pi \triangle Y\Pi)$.

The following approximate inequalities summarize our reduction:

$$\begin{aligned}
S_\eta(XU, YU | U) &\ge S_\eta(X\hat{Y}, \hat{X}Y | \Pi_{\mathrm{sim}} \Pi_{\mathrm{DE}} U) \\
&\gtrsim S_\eta(XY, XY | U) - |\pi_{\mathrm{sim}}| - h(X\Pi \triangle Y\Pi) \\
&\approx h(X, Y) - |\pi_{\mathrm{sim}}| - h(X\Pi \triangle Y\Pi), \qquad (13)
\end{aligned}$$

where the first inequality is by Lemma 8 and the second by Lemma 9.

We note that the generation of secret keys from data exchange was first proposed in [14] in an amortized,
IID setup and was shown to yield a secret key of asymptotically optimal rate.

From (12) and (13) it follows that

$$|\pi_{\mathrm{sim}}| \gtrsim h(X, Y) - h(X\Pi \triangle Y\Pi) - i(X \wedge Y) = \mathrm{ic}(\Pi; X, Y),$$

which is the required lower bound.
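The final equality is a pointwise identity between densities and can be checked numerically. The sketch below assumes the definition $\mathrm{ic}(\tau; x, y) = \log\frac{P_{\Pi|XY}(\tau|x,y)}{P_{\Pi|X}(\tau|x)} + \log\frac{P_{\Pi|XY}(\tau|x,y)}{P_{\Pi|Y}(\tau|y)}$, which is stated earlier in the paper and not in this section, and $h(X\Pi \triangle Y\Pi) = h(X\Pi|Y\Pi) + h(Y\Pi|X\Pi)$. The helper names `marginal` and `identity_gap` are hypothetical, and the joint pmf used in the example is arbitrary rather than induced by an actual interactive protocol.

```python
from collections import defaultdict
from math import log2


def marginal(joint, keep):
    """Marginalize a pmf over (x, y, t) onto the coordinates in `keep`."""
    out = defaultdict(float)
    for key, pr in joint.items():
        out[tuple(key[i] for i in keep)] += pr
    return out


def identity_gap(joint):
    """Largest |ic - (h(x,y) - h(xt|yt) - h(yt|xt) - i(x^y))| on the support.

    joint maps (x, y, t) to a probability, t being the transcript value.
    """
    pxy = marginal(joint, (0, 1))
    px, py = marginal(joint, (0,)), marginal(joint, (1,))
    pxt, pyt = marginal(joint, (0, 2)), marginal(joint, (1, 2))
    worst = 0.0
    for (x, y, t), pr in joint.items():
        if pr == 0:
            continue
        p_t_xy = pr / pxy[(x, y)]
        ic = (log2(p_t_xy / (pxt[(x, t)] / px[(x,)]))
              + log2(p_t_xy / (pyt[(y, t)] / py[(y,)])))
        h_xy = -log2(pxy[(x, y)])
        h_sum = -log2(pr / pyt[(y, t)]) - log2(pr / pxt[(x, t)])
        i_xy = log2(pxy[(x, y)] / (px[(x,)] * py[(y,)]))
        worst = max(worst, abs(ic - (h_xy - h_sum - i_xy)))
    return worst
```

On any joint pmf with full support the returned gap is zero up to floating-point error, which confirms that the heuristic bound is exactly the information complexity density.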

Clearly, the steps above are not precise. We have used instantaneous communication and common
randomness lengths in our bounds whereas a formal treatment will require us to use worst-case performance
bounds for these quantities. Unfortunately, such worst-case bounds do not yield our desired lower bound
for $D_\varepsilon(\pi)$. To fill this gap, we apply the arguments above not for the original distribution $P_{\Pi_X \Pi_Y \Pi XYU}$
but for the conditional distribution $P_{\Pi_X \Pi_Y \Pi XYU | \mathcal{E}}$ where the event $\mathcal{E}$ is carefully constructed in such a
manner that the aforementioned worst-case bounds are close to instantaneous bounds for all realizations.
Specifically, $\mathcal{E}$ is selected by appropriately slicing the spectrums of the various information densities that
appear in the worst-case bounds.


D. From original to conditional probabilities: A Spectrum slicing argument 

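To identify an appropriate critical event for conditioning, we take recourse to spectrum slicing.
Specifically, we identify an appropriate subset of intersection of slices of spectrums (ii) and (iv) described in
Section III-A.

For the combined protocol $(\pi_{\mathrm{sim}}, \pi_{\mathrm{DE}})$ and the estimates $(\hat{X}, \hat{Y})$ as above, and $\lambda_{\min}^{(i)}$, $\lambda_{\max}^{(i)}$,
$i = 1, 2, 3$, as in Section III-A, let

$$\begin{aligned}
\mathcal{E}_\lambda &= \{\mathrm{ic}(\Pi; X, Y) > \lambda\}, \\
\mathcal{E}_i^{(1)} &= \{\lambda_{\min}^{(1)} + (i - 1)\Delta_1 \le h(X, Y) < \lambda_{\min}^{(1)} + i\Delta_1\}, \quad 1 \le i \le N_1, \\
\mathcal{E}_j^{(3)} &= \{\lambda_{\min}^{(3)} + (j - 1)\Delta_3 \le h(X\Pi \triangle Y\Pi) < \lambda_{\min}^{(3)} + j\Delta_3\}, \quad 1 \le j \le N_3,
\end{aligned}$$

where

$$N_1 = \frac{\lambda_{\max}^{(1)} - \lambda_{\min}^{(1)}}{\Delta_1} \quad \text{and} \quad N_3 = \frac{\lambda_{\max}^{(3)} - \lambda_{\min}^{(3)}}{\Delta_3}.$$

Note that $\bigcup_i \mathcal{E}_i^{(1)} = \mathcal{E}_1^c$ and $\bigcup_j \mathcal{E}_j^{(3)} = \mathcal{E}_3^c$, where the events $\mathcal{E}_1$ and $\mathcal{E}_3$ are as in (3). Finally, define the
event $\mathcal{E}_{ij}$ as follows:

$$\mathcal{E}_{ij} = \mathcal{E}_{\mathrm{sim}} \cap \mathcal{E}_{\mathrm{DE}} \cap \mathcal{E}_\lambda \cap \mathcal{E}_i^{(1)} \cap \mathcal{E}_j^{(3)}, \quad 1 \le i \le N_1,\ 1 \le j \le N_3.$$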
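The slicing itself is elementary: a realized density value is mapped to the index of the length-$\Delta$ slice that contains it, and values outside the essential spectrum fall into the tail event. The following sketch shows this mapping; the function name `slice_index` is hypothetical.

```python
import math


def slice_index(value, lam_min, lam_max, delta):
    """Return the index i in 1..N with
    lam_min + (i - 1) * delta <= value < lam_min + i * delta,
    or None when value lies outside [lam_min, lam_max) (the tail event)."""
    if not lam_min <= value < lam_max:
        return None
    return math.floor((value - lam_min) / delta) + 1
```

Two realizations that land in the same slice have density values within $\Delta$ of each other, which is why worst-case bounds computed inside a single slice are close to the instantaneous ones.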


The next lemma says that (at least) one of the events £ t] has significant probability, and this particular 
event will be used as the critical event in our proofs. 


Lemma 15. There exists i,j such that 


Pr (£ij) > 


Pr (£\) - e - £ tai i - N 2 2 € def 
iViiV 3 


(14) 


Proof. Note that the event £ sim n £-m fi £ 3 is the same as the event A 0 B H £'i of GD- Therefore, 


Pr (£ sim n £ de n£ x n £{ n £f) > Pr (£ x ) + Pr (£ S im n £ D e n £f) + Pr {£{) - 2 

> Pr (£ x ) - e - Pr (£ 2 ) - Pr (f 3 ) - N 2 - Pr (£) 

> Pr (£ x ) -£- Stall - N 2 2 _? , 


where the second inequality uses GD and and the third uses ©• The proof is completed upon noting 
that {£ij}i,j constitutes a partition of £ sim n £m H £\ n £f n with N) ]V 3 parts. ■ 
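The counting step that closes the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the Gaussian samples, the slice boundaries, and the helper names `slice_index` and `heaviest_cell` are all assumptions made for illustration. It bins pairs of "density" values into an $N_1 \times N_3$ grid of slices and confirms that the heaviest cell carries at least a $1/(N_1 N_3)$ fraction of the mass that falls inside the grid.

```python
import random
from collections import Counter

def slice_index(value, lam_min, delta, n_slices):
    """Return the 1-based index i with
    lam_min + (i-1)*delta <= value < lam_min + i*delta, or None if outside."""
    i = int((value - lam_min) // delta) + 1
    return i if 1 <= i <= n_slices else None

def heaviest_cell(samples, lam1, d1, n1, lam3, d3, n3):
    """Empirical mass inside the n1 x n3 grid and mass of its heaviest cell."""
    cells = Counter()
    for h1, h3 in samples:
        i = slice_index(h1, lam1, d1, n1)
        j = slice_index(h3, lam3, d3, n3)
        if i is not None and j is not None:
            cells[(i, j)] += 1
    total = len(samples)
    inside = sum(cells.values()) / total
    best = max(cells.values()) / total if cells else 0.0
    return inside, best

# Hypothetical samples standing in for (h(X,Y), h(X Pi triangle Y Pi)).
rng = random.Random(0)
samples = [(rng.gauss(10, 1), rng.gauss(5, 1)) for _ in range(10000)]
inside, best = heaviest_cell(samples, 7.0, 2.0, 3, 2.0, 2.0, 3)
```

Since the cells partition the part of the sample space inside the grid, `best >= inside / (n1 * n3)` always holds, which is the pigeonhole fact used to pick the critical event.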


E. From simulation to secret keys: The formal reduction proof 

We are now in a position to complete the proof of our lower bound. For brevity, let $\mathcal{E}$ denote the event
$\mathcal{E}_{ij}$ of Lemma 15 satisfying $\Pr(\mathcal{E}) \ge \alpha$.

Our proof essentially formalizes the steps outlined in Section V-C but for the conditional distribution
given $\mathcal{E}$. With an abuse of notation, let $S_\eta(X, Y | Z, \mathcal{E})$ denote the maximum length of an $\eta$-secret key
for two parties observing $X$ and $Y$, and the eavesdropper's side information $Z$, when the distribution
of $(X, Y, Z)$ is given by $P_{XYZ|\mathcal{E}}$. Then, using Lemma 12 with $Q_X = P_X$ and $Q_Y = P_Y$, we get the
following bound in place of (12):

$$S_{2\eta}(X, Y | \mathcal{E}) \le \gamma - \log\Big(\Pr\Big(\Big\{(x, y) : \log\frac{P_{XY|\mathcal{E}}(x, y)}{P_X(x) P_Y(y)} < \gamma\Big\} \,\Big|\, \mathcal{E}\Big) - 3\eta\Big)_+ + 2\log(1/\eta)$$

$$\le \gamma - \log\Big(\Pr\big(\{(x, y) : i_{P_{XY}}(x \wedge y) < \gamma + \log\alpha\} \,\big|\, \mathcal{E}\big) - 3\eta\Big)_+ + 2\log(1/\eta), \tag{15}$$

where $0 < \eta < 1/3$ is arbitrary and in the previous inequality we have used

$$P_{XY|\mathcal{E}}(x, y) \le \frac{P_{XY}(x, y)}{\Pr(\mathcal{E})} \le \frac{P_{XY}(x, y)}{\alpha}.$$


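To replace (13), note that by Lemma 8

$$S_{2\eta}(X, Y | \mathcal{E}) \ge S_{2\eta}(X\Pi_{\rm sim}\Pi_{\rm DE}, Y\Pi_{\rm sim}\Pi_{\rm DE} | U, \Pi_{\rm sim}, \Pi_{\rm DE}, \mathcal{E})$$

$$\ge S_{2\eta}(X\hat{Y}, \hat{X}Y | U, \Pi_{\rm sim}, \Pi_{\rm DE}, \mathcal{E}). \tag{16}$$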

Next, note that by ( fT7)j ) the transcript n sim lI DE takes no more than 2 k=i»l+M^ nAy n)+A 2 +£ values for 
every realization (X,Y) (ji £ 3 . However, when the event £ = £ rJ holds, h (ATIAV'I I) < A^; n + . 7 A 3 . It 
follows by Lemma |T0] that 

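$$S_{2\eta}(X\hat{Y}, \hat{X}Y | U, \Pi_{\rm sim}, \Pi_{\rm DE}, \mathcal{E})$$

$$\ge S_\eta(X\hat{Y}, \hat{X}Y | U, \mathcal{E}) - |\pi_{\rm sim}| - \lambda^{(3)}_{\min} - j\Delta_3 - \Delta_2 - \xi - 2\log(1/2\eta). \tag{17}$$

Also, since $\{\hat{X} = X, \hat{Y} = Y\}$ holds when we condition on $\mathcal{E}$,

$$S_\eta(X\hat{Y}, \hat{X}Y | U, \mathcal{E}) = S_\eta(XY, XY | U, \mathcal{E})$$

$$\ge H_{\min}(P_{XYU|\mathcal{E}} | U) - 2\log(1/2\eta), \tag{18}$$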


where the previous inequality is by the leftover hash lemma. Furthermore, by using

$$P_{XYU|\mathcal{E}}(x, y, u) \le \frac{P_{XYU}(x, y, u)}{\Pr(\mathcal{E})} \le \frac{P_{XYU}(x, y, u)}{\alpha},$$

we can bound $H_{\min}(P_{XYU|\mathcal{E}} | U)$ as follows:

$$H_{\min}(P_{XYU|\mathcal{E}} | U) \ge \min_{x, y, u} -\log\frac{P_{XYU|\mathcal{E}}(x, y, u)}{P_U(u)}$$

$$\ge \min_{x, y, u} -\log\frac{P_{XYU}(x, y, u)\,\mathbb{1}(P_{XYU|\mathcal{E}}(x, y, u) > 0)}{\alpha P_U(u)}$$

$$= \min_{(x, y) \in \mathcal{E}^{(1)}_i} h_{P_{XY}}(x, y) + \log\alpha$$

$$\ge \lambda^{(1)}_{\min} + (i-1)\Delta_1 + \log\alpha. \tag{19}$$

Thus, on combining (16)-(19), we get

$$S_{2\eta}(X, Y | \mathcal{E}) \ge [\lambda^{(1)}_{\min} + (i-1)\Delta_1 - \lambda^{(3)}_{\min} - j\Delta_3 + \log\alpha] - \Delta_2 - \xi - 4\log(1/2\eta) - |\pi_{\rm sim}|. \tag{20}$$

To get a matching form of the upper bound (15) for $S_{2\eta}(X, Y | \mathcal{E})$, note that since$^{16}$

$$-\mathrm{ic}_{P_{\Pi XY}}(\tau; x, y) = i_{P_{XY}}(x \wedge y) - h_{P_{XY}}(x, y) + h_{P_{\Pi XY}}((x, \tau) \triangle (y, \tau)),$$

and since under $\mathcal{E}$

$$h_{P_{XY}}(x, y) < \lambda^{(1)}_{\min} + i\Delta_1,$$

$$h_{P_{\Pi XY}}((x, \tau) \triangle (y, \tau)) \ge \lambda^{(3)}_{\min} + (j-1)\Delta_3,$$

it holds that

$$\Pr\big(\{(x, y) : i_{P_{XY}}(x \wedge y) < \gamma + \log\alpha\} \,\big|\, \mathcal{E}\big)$$

$$\ge \Pr\big(\{(x, y, \tau) : -\mathrm{ic}_{P_{\Pi XY}}(\tau; x, y) < \gamma - \lambda^{(1)}_{\min} - i\Delta_1 + \lambda^{(3)}_{\min} + (j-1)\Delta_3 + \log\alpha\} \,\big|\, \mathcal{E}\big).$$

On choosing

$$\gamma = -\lambda + \lambda^{(1)}_{\min} + i\Delta_1 - \lambda^{(3)}_{\min} - (j-1)\Delta_3 - \log\alpha,$$

it follows from (15) that

$$S_{2\eta}(X, Y | \mathcal{E}) \le -\lambda + [\lambda^{(1)}_{\min} + i\Delta_1 - \lambda^{(3)}_{\min} - (j-1)\Delta_3 - \log\alpha] - \log(\Pr(\mathcal{E}_\lambda | \mathcal{E}) - 3\eta)_+ + 2\log(1/\eta)$$

$$= -\lambda + [\lambda^{(1)}_{\min} + i\Delta_1 - \lambda^{(3)}_{\min} - (j-1)\Delta_3 - \log\alpha] - \log(1 - 3\eta) + 2\log(1/\eta), \tag{21}$$

where the equality holds since $\Pr(\mathcal{E}_\lambda | \mathcal{E}) = 1$.

Thus, by (20) and (21), we get

$$|\pi_{\rm sim}| \ge \lambda + 2\log\alpha - \Delta_1 - \Delta_2 - \Delta_3 - \xi - 6\log(1/\eta) + \log(1 - 3\eta) + 4$$

$$= \lambda + 2\log(\Pr(\mathcal{E}_\lambda) - \epsilon - \epsilon_{\rm tail} - \eta) - 2\log N_1 N_3 - (\Delta_1 + \Delta_2 + \Delta_3) - \log N_2 - 7\log(1/\eta) + \log(1 - 3\eta) + 4,$$

where the equality holds for $\xi = -\log\eta + \log N_2$. Note that the maximum value of the right side above,
when maximized over $N_i$ and $\Delta_i$ under the constraint $N_i\Delta_i = \Lambda_i$, $i = 1, 2, 3$, occurs for $\Delta_1 = \Delta_3 = 2$
and $\Delta_2 = 1$. Substituting this choice of parameters, we get

$$|\pi_{\rm sim}| \ge \lambda + 2\log(\Pr(\mathcal{E}_\lambda) - \epsilon - \epsilon_{\rm tail} - \eta) - 2\log\Lambda_1\Lambda_3 - \log\Lambda_2 - 7\log(1/\eta) + \log(1 - 3\eta) + 3$$

$$\ge \lambda - 2\log\Lambda_1\Lambda_3 - \log\Lambda_2 - 9\log(1/\eta) + \log(1 - 3\eta) + 3,$$

where the final inequality holds for every $\lambda$ such that $\Pr(\mathcal{E}_\lambda) \ge \epsilon + \epsilon_{\rm tail} + 2\eta$; Theorem 1 follows upon
maximizing the right side over all such $\lambda$. ■

16 For clarity, we display the dependence of each information density on the underlying distribution in the remainder of this
section.

VI. Simulation Protocol and the Upper Bound 

In this section, we formally present an $\epsilon$-simulation of a given interactive protocol $\pi$ with bounded
rounds. For clarity, we build the simulation protocol in steps.

A. Sending X using one-sided communication 

We start with the well-known Slepian-Wolf compression problem [45] where Party 1 wants to transmit
$X$ itself to Party 2 using as few bits as possible. This corresponds to simulating the deterministic protocol
$\Pi = \Pi_1 = X$. See Remark 1 in Section II for a discussion on simulation of deterministic protocols.

For the encoder, we use a hash function that is randomly chosen from a 2-universal hash family $\mathcal{F}_l(\mathcal{X})$;
for the decoder, we use a kind of joint typical decoder [12]. Let the typical set $\mathcal{T}_{P_{X|Y}}$ be given by

$$\mathcal{T}_{P_{X|Y}} = \{(x, y) : h_{P_{X|Y}}(x|y) \le l - \gamma\} \tag{22}$$

for a slack parameter $\gamma > 0$. Our first protocol is given below:

Protocol 1: Slepian-Wolf compression

Input: Observations $X$ and $Y$, uniform public randomness $U_{\rm hash}$, and a parameter $l$
Output: Estimate $\hat{X}$ of $X$ at Party 2
Both parties use $U_{\rm hash}$ to select $f$ from $\mathcal{F}_l(\mathcal{X})$
Party 1 sends $\Pi_{{\rm sim},1} = f(X)$
if Party 2 finds a unique $x \in \mathcal{T}_{P_{X|Y}}$ with hash value $f(x) = \Pi_{{\rm sim},1}$ then
    set $\hat{X} = x$
else
    protocol declares an error

The following result is from [34], [22, Lemma 7.2.1] (see, also, [30]).

Lemma 16 (Performance of Protocol 1). For every $\gamma > 0$, the protocol above satisfies

$$\Pr(X \ne \hat{X}) \le P_{XY}(\mathcal{T}^c_{P_{X|Y}}) + 2^{-\gamma}.$$

Essentially, the result above says that Party 1 can send $X$ to Party 2 with probability of error less than
$\epsilon$ using roughly as many bits as the $\epsilon$-tail of $h_{P_{X|Y}}(X|Y)$.
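The hash-and-search structure of Protocol 1 can be sketched in a few lines. This is an illustration under stated assumptions rather than the construction analyzed in the paper: the 2-universal family is instantiated with random binary matrices over GF(2), the alphabet is the set of $m$-bit integers, and the toy conditional distribution `COND_PMF` and all function names are hypothetical.

```python
import math
import random

def random_linear_hash(m, l, rng):
    """Draw a random l x m binary matrix A; the maps x -> A x over GF(2)
    form a 2-universal family on m-bit inputs."""
    rows = [rng.getrandbits(m) for _ in range(l)]
    def f(x):
        # Output bit t is the parity of the bitwise AND of row t with x.
        return tuple(bin(row & x).count("1") % 2 for row in rows)
    return f

def typical_set(cond_pmf, y, l, gamma):
    """Candidates x whose density -log2 P(x|y) is at most l - gamma."""
    return [x for x, p in cond_pmf[y].items()
            if p > 0 and -math.log2(p) <= l - gamma]

def slepian_wolf(x, y, cond_pmf, m, l, gamma, rng):
    """One run of the scheme; returns the estimate, or None if an error is declared."""
    f = random_linear_hash(m, l, rng)   # selected with shared public randomness
    message = f(x)                      # the l bits sent by Party 1
    matches = [c for c in typical_set(cond_pmf, y, l, gamma) if f(c) == message]
    return matches[0] if len(matches) == 1 else None

# Toy source (assumption): given Y = y, X equals y with probability 0.7,
# or differs from y in exactly one of three bit positions.
COND_PMF = {y: {y: 0.7, y ^ 1: 0.1, y ^ 2: 0.1, y ^ 4: 0.1} for y in range(16)}
```

When the true $x$ lies in the typical set, the decoder either returns $x$ or declares an error because a second typical candidate collides with the hash of $x$; the collision probability per candidate is $2^{-l}$, which is where the $2^{-\gamma}$ term of Lemma 16 comes from.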


In fact, the use of the typical set in (22) is not crucial in Protocol 1 and its performance analysis: For a
given measure $Q_{XY}$, we can define another typical set $\mathcal{T}_{Q_{X|Y}}$ by replacing $h_{P_{X|Y}}(x|y)$ with $h_{Q_{X|Y}}(x|y)$
in (22) even though the underlying distribution of $(X, Y)$ is $P_{XY}$. Then, the error probability is bounded
as

$$\Pr(X \ne \hat{X}) \le P_{XY}(\mathcal{T}^c_{Q_{X|Y}}) + 2^{-\gamma},$$

which implies that $X$ can be sent by using roughly as many bits as the $\epsilon$-tail of $h_{Q_{X|Y}}(X|Y)$ under $P_{XY}$.
This modification simplifies our performance analysis of the more involved protocols in the following
sections.


B. Sending X using interactive communication 

Protocol 1 aims at minimizing the worst-case communication length over all realizations of $(X, Y)$.
However, our goal here is to simulate a multiround interactive protocol, and we need not account for the
worst-case communication length in each round. Instead, we shall optimize the worst-case communication
length for the combined interactive protocol. The protocol below is a modification of Protocol 1 and uses
roughly $h(X|Y)$ bits for transmitting $X$ instead of its $\epsilon$-tail.

The new protocol proceeds as the previous one but relies on spectrum slicing to adapt the length of
communication to the specific realization of $(X, Y)$: It increases the size of the hash output gradually,
starting with $\lambda_1 = \lambda_{\min}$ and increasing the size $\Delta$ bits at a time until either Party 2 decodes $X$ or $\lambda_{\max}$
bits have been sent. After each transmission, Party 2 sends either an ACK or a NACK feedback signal. The
protocol stops when an ACK symbol is received.

Fix an auxiliary distribution $Q_{XY}$. For $\lambda^{\min}_{Q_{X|Y}}, \lambda^{\max}_{Q_{X|Y}}, \Delta_{Q_{X|Y}} > 0$ with $\lambda^{\max}_{Q_{X|Y}} > \lambda^{\min}_{Q_{X|Y}}$, let

$$N_{Q_{X|Y}} = \frac{\lambda^{\max}_{Q_{X|Y}} - \lambda^{\min}_{Q_{X|Y}}}{\Delta_{Q_{X|Y}}}$$

and

$$\lambda^{(i)}_{Q_{X|Y}} = \lambda^{\min}_{Q_{X|Y}} + (i-1)\Delta_{Q_{X|Y}}, \quad 1 \le i \le N_{Q_{X|Y}}.$$

Further, let

$$\mathcal{T}^{(0)}_{Q_{X|Y}} := \{(x, y) \mid h_{Q_{X|Y}}(x|y) \ge \lambda^{\max}_{Q_{X|Y}} \text{ or } h_{Q_{X|Y}}(x|y) < \lambda^{\min}_{Q_{X|Y}}\}, \tag{23}$$

and for $1 \le i \le N_{Q_{X|Y}}$, let $\mathcal{T}^{(i)}_{Q_{X|Y}}$ denote the $i$th slice of the spectrum given by

$$\mathcal{T}^{(i)}_{Q_{X|Y}} = \{(x, y) \mid \lambda^{(i)}_{Q_{X|Y}} \le h_{Q_{X|Y}}(x|y) < \lambda^{(i)}_{Q_{X|Y}} + \Delta_{Q_{X|Y}}\}.$$

Note that $\mathcal{T}^{(0)}_{Q_{X|Y}}$ corresponds to $\mathcal{T}^c_{Q_{X|Y}}$ in the previous section and will be counted as an error event.


Protocol 2: Interactive Slepian-Wolf compression

Input: Observations $X$ and $Y$ with distribution $P_{XY}$, uniform public randomness $U_{\rm hash}$, auxiliary
distribution $Q_{XY}$, and parameters $\gamma$, $\lambda^{\min}_{Q_{X|Y}}$, $\Delta_{Q_{X|Y}}$, $N_{Q_{X|Y}}$, and $l$
Output: Estimate $\hat{X}$ of $X$ at Party 2
Both parties use $U_{\rm hash}$ to select $f_1$ from $\mathcal{F}_l(\mathcal{X})$
Party 1 sends $\Pi_{{\rm sim},1} = f_1(X)$
if Party 2 finds a unique $x \in \mathcal{T}^{(1)}_{Q_{X|Y}}$ with hash value $f_1(x) = \Pi_{{\rm sim},1}$ then
    set $\hat{X} = x$
    send back $\Pi_{{\rm sim},2}$ = ACK
else
    send back $\Pi_{{\rm sim},2}$ = NACK
while $2 \le i \le N_{Q_{X|Y}}$ and Party 2 did not send an ACK do
    Both parties use $U_{\rm hash}$ to select $f_{2i-1}$ from $\mathcal{F}_{\Delta_{Q_{X|Y}}}(\mathcal{X})$, independent of $f_1, f_3, \ldots, f_{2i-3}$
    Party 1 sends $\Pi_{{\rm sim},2i-1} = f_{2i-1}(X)$
    if Party 2 finds a unique $x \in \mathcal{T}^{(i)}_{Q_{X|Y}}$ with hash value $f_{2j-1}(x) = \Pi_{{\rm sim},2j-1}$, $\forall\, 1 \le j \le i$ then
        set $\hat{X} = x$
        send back $\Pi_{{\rm sim},2i}$ = ACK
    else
        if More than one such $x$ found then
            protocol declares an error
        else
            send back $\Pi_{{\rm sim},2i}$ = NACK
    Reset $i \to i + 1$
if No $\hat{X}$ found at Party 2 then
    Protocol declares an error

Our protocol is described in Protocol 2. For every $(x, y) \in \mathcal{T}^{(i)}_{Q_{X|Y}}$, $1 \le i \le N_{Q_{X|Y}}$, the following
lemma provides a bound on the error.

Lemma 17 (Performance of Protocol 2). For $(x, y) \in \mathcal{T}^{(i)}_{Q_{X|Y}}$, $1 \le i \le N_{Q_{X|Y}}$, denoting by $\hat{X} = \hat{X}(x, y)$
the estimate of $x$ at Party 2 at the end of the protocol (with the convention that $\hat{X} = \emptyset$ if an error is
declared), Protocol 2 sends at most $(l + (i-1)\Delta_{Q_{X|Y}} + i)$ bits and has probability of error bounded
above as follows:

$$\Pr(\hat{X} \ne x \mid X = x, Y = y) \le i\, 2^{\lambda^{\min}_{Q_{X|Y}} + \Delta_{Q_{X|Y}} - l}.$$

Proof: Since $(x, y) \in \mathcal{T}^{(i)}_{Q_{X|Y}}$, an error occurs if there exists a $\hat{x} \ne x$ such that $(\hat{x}, y) \in \mathcal{T}^{(j)}_{Q_{X|Y}}$ and
$\Pi_{{\rm sim},2k-1} = f_{2k-1}(\hat{x})$ for $1 \le k \le j$ for some $j \le i$. Therefore, the probability of error is bounded
above as

$$\Pr(\hat{X} \ne x \mid X = x, Y = y) \le \sum_{j=1}^{i} \sum_{\hat{x} \ne x} \Pr\big(f_{2k-1}(x) = f_{2k-1}(\hat{x}),\ \forall\, 1 \le k \le j\big)\, \mathbb{1}\big((\hat{x}, y) \in \mathcal{T}^{(j)}_{Q_{X|Y}}\big)$$

$$\le \sum_{j=1}^{i} \sum_{\hat{x} \ne x} \frac{1}{2^{l + (j-1)\Delta_{Q_{X|Y}}}}\, \mathbb{1}\big((\hat{x}, y) \in \mathcal{T}^{(j)}_{Q_{X|Y}}\big)$$

$$= \sum_{j=1}^{i} \frac{1}{2^{l + (j-1)\Delta_{Q_{X|Y}}}}\, \big|\{\hat{x} \mid (\hat{x}, y) \in \mathcal{T}^{(j)}_{Q_{X|Y}}\}\big|$$

$$\le i\, 2^{\lambda^{\min}_{Q_{X|Y}} + \Delta_{Q_{X|Y}} - l},$$

where the first inequality follows from the union bound, the second inequality follows from the property
of the 2-universal hash family, and the third inequality follows from the fact that

$$\big|\{\hat{x} \mid (\hat{x}, y) \in \mathcal{T}^{(j)}_{Q_{X|Y}}\}\big| \le 2^{\lambda^{(j)}_{Q_{X|Y}} + \Delta_{Q_{X|Y}}}.$$

Note that the protocol sends $l$ bits in the first transmission, and $\Delta_{Q_{X|Y}}$ bits and 1-bit feedback in every
subsequent transmission. Therefore, no more than $(l + (i-1)\Delta_{Q_{X|Y}} + i)$ bits are sent. ■
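The incremental structure of Protocol 2 and the bit count of Lemma 17 can be illustrated with a small simulation. This is a sketch under assumptions, not the paper's implementation: hash functions are random binary matrices over GF(2), the alphabet is the set of $m$-bit integers, the slice parameters are integers, and the names `interactive_slepian_wolf` and `cond_pmf` are hypothetical.

```python
import math
import random

def random_linear_hash(m, length, rng):
    """Random binary matrix hash x -> A x over GF(2) (a 2-universal family)."""
    rows = [rng.getrandbits(m) for _ in range(length)]
    return lambda x: tuple(bin(row & x).count("1") % 2 for row in rows)

def interactive_slepian_wolf(x, y, cond_pmf, m, lam_min, delta, n_slices, gamma, rng):
    """Return (estimate or None, total bits exchanged including feedback)."""
    first_len = lam_min + delta + gamma       # the choice of l used in Corollary 18
    hashes, sent, bits = [], [], 0
    for i in range(1, n_slices + 1):
        length = first_len if i == 1 else delta
        f = random_linear_hash(m, length, rng)
        hashes.append(f)
        sent.append(f(x))
        bits += length + 1                    # hash bits plus one ACK/NACK bit
        lo = lam_min + (i - 1) * delta        # lower edge of the i-th slice
        slice_i = [c for c, p in cond_pmf[y].items()
                   if p > 0 and lo <= -math.log2(p) < lo + delta]
        matches = [c for c in slice_i
                   if all(g(c) == s for g, s in zip(hashes, sent))]
        if len(matches) == 1:
            return matches[0], bits           # Party 2 sends ACK
        if i > 1 and len(matches) > 1:
            return None, bits                 # protocol declares an error
    return None, bits                         # no estimate found
```

Whenever the estimate equals the true $x$ and $(x, y)$ lies in slice $i$, the returned bit count is exactly $l + (i-1)\Delta + i$, matching the count in Lemma 17.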

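Corollary 18. Protocol 2 with $l = \lambda^{\min}_{Q_{X|Y}} + \Delta_{Q_{X|Y}} + \gamma$ sends at most $(h_{Q_{X|Y}}(X|Y) + \Delta_{Q_{X|Y}} + \gamma + N_{Q_{X|Y}})$
bits when the observations are$^{17}$ $(X, Y) \notin \mathcal{T}^{(0)}_{Q_{X|Y}}$, and has probability of error less than

$$\Pr(\hat{X} \ne X) \le \Pr\big((X, Y) \in \mathcal{T}^{(0)}_{Q_{X|Y}}\big) + N_{Q_{X|Y}} 2^{-\gamma}.$$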


C. Simulation of $\Pi_1$ using interactive communication

We now proceed to simulating the first round of our given interactive protocol $\pi$. Note that using
Protocol 2 we can send $\Pi_1$ using roughly $h(\Pi_1|Y)$ bits. This protocol uses public randomness $U_{\rm hash}$
only to choose hash functions, which is convenient for our probability of error analysis, and can be easily
derandomized. We now present a scheme which uses another independent portion of public randomness
$U_{\rm sim}$ to reduce the rate of the communication further. However, the scheme will only allow the parties
to simulate $\Pi_1$ (rather than recover it with small probability of error) and cannot be derandomized.

Specifically, our next protocol uses $X$ and $U = (U_{\rm hash}, U_{\rm sim})$ to simulate $\Pi_1$ in such a manner that
$U_{\rm sim}$ can be treated, in effect, as a portion of the communication used in Protocol 2. Note that since
$U_{\rm sim}$ is independent of $(X, Y)$, the portion of communication which is equivalent to $U_{\rm sim}$ must as well
be almost independent of $(X, Y)$. Such a portion can be guaranteed by noting that the communication
used in Protocol 2 is simply a random hash of $\Pi_1$ drawn from a 2-universal family, and therefore, an
appropriately small portion of it can have the desired independence property by the leftover hash lemma. In
fact, since the Markov condition $\Pi_1 -\!\circ\!- X -\!\circ\!- Y$ holds, it suffices to guarantee independence from $X$ instead
of $(X, Y)$.

17 When $h_{Q_{X|Y}}(X|Y) < \lambda^{\min}_{Q_{X|Y}}$, Protocol 2 may transmit more than $(h_{Q_{X|Y}}(X|Y) + \Delta_{Q_{X|Y}} + \gamma + N_{Q_{X|Y}})$ bits.


Protocol 3: Simulation of $\Pi_1$

Input: Observations $X$ and $Y$ with distribution $P_{XY}$, uniform public randomness
$U = (U_{\rm hash}, U_{\rm sim})$, auxiliary distribution $Q_{\Pi_1 Y}$, and parameters $\gamma$, $\lambda^{\min}_{Q_{\Pi_1|Y}}$, $\Delta_{Q_{\Pi_1|Y}}$, $N_{Q_{\Pi_1|Y}}$
and $k$
Output: Estimates $\Pi_{1X}$ and $\Pi_{1Y}$ of $\Pi_1$
1. Two parties share $k$ random bits $U_{\rm sim}$ and an $f$ chosen from $\mathcal{F}_k(\mathrm{supp}(\Pi_1))$ using $U_{\rm hash}$
2. Party 1 generates a sample $\Pi_{1X}$ using $P_{\Pi_1 | X f(\Pi_1)}(\cdot \mid X, U_{\rm sim})$
3. Parties use Protocol 2 with auxiliary distribution $Q_{\Pi_1 Y}$, and parameters $\gamma$, $\lambda^{\min}_{Q_{\Pi_1|Y}}$, $\Delta_{Q_{\Pi_1|Y}}$,
$N_{Q_{\Pi_1|Y}}$, and $l = \lambda^{\min}_{Q_{\Pi_1|Y}} + \Delta_{Q_{\Pi_1|Y}} + \gamma$ to send $\Pi_{1X}$ to Party 2 by treating $U_{\rm sim}$ as the first $k$ bits of
communication obtained via the hash function $f$

Our simulation protocol is described in Protocol 3. Let the quantities such as $\lambda^{\min}_{Q_{\Pi_1|Y}}$, $\Delta_{Q_{\Pi_1|Y}}$, and
$N_{Q_{\Pi_1|Y}}$ be defined analogously to the corresponding quantities in Section VI-B with $\Pi_1$ replacing $X$.
The following lemma provides a bound on the simulation error for Protocol 3.

Lemma 19 (Performance of Protocol 3). Protocol 3 sends at most

$$\big(h_{Q_{\Pi_1|Y}}(\Pi_{1X}|Y) + \Delta_{Q_{\Pi_1|Y}} + N_{Q_{\Pi_1|Y}} + \gamma - k\big)_+$$

bits when $(\Pi_{1X}, Y) \notin \mathcal{T}^{(0)}_{Q_{\Pi_1|Y}}$, and has simulation error

$$d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY}, P_{\Pi_1\Pi_1 XY}\big) \le \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{Q_{\Pi_1|Y}}\big) + N_{Q_{\Pi_1|Y}} 2^{-\gamma} + \frac{1}{2}\sqrt{2^{k - H_{\min}(P_{\Pi_1 X}|Q_X)}}$$

for any auxiliary distribution $Q_X$ on $\mathcal{X}$.

Proof: Consider the following simple protocol for simulating $\Pi_1$ at Party 2:

1) Party 1 generates a sample $\Pi_1$ using $P_{\Pi_1|X}(\cdot|X)$.

2) Both parties use Protocol 2 with auxiliary distribution $Q_{\Pi_1 Y}$, and parameters $\gamma$, $\lambda^{\min}_{Q_{\Pi_1|Y}}$, $\Delta_{Q_{\Pi_1|Y}}$,
$N_{Q_{\Pi_1|Y}}$, and $l = \lambda^{\min}_{Q_{\Pi_1|Y}} + \Delta_{Q_{\Pi_1|Y}} + \gamma$ to generate an estimate $\hat{\Pi}_1$ of $\Pi_1$ at Party 2.

In this protocol, $l_{\rm wst} = \lambda^{\min}_{Q_{\Pi_1|Y}} + N_{Q_{\Pi_1|Y}}\Delta_{Q_{\Pi_1|Y}} + \gamma$ bits of hash values will be sent for the worst $(\Pi_1, Y)$.
We divide these $l_{\rm wst}$ hash values into two parts, the first $k$ bits and the last $l_{\rm wst} - k$ bits; let $f$ and $f'$,
respectively, denote the hash functions producing the first and the second parts.$^{18}$ Protocol 3 replaces, in
effect, $f$ with shared randomness $U_{\rm sim}$ for an appropriately chosen value of $k$.

Note that the joint distribution of the random variables involved in the simple protocol above satisfies

$$P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \hat{\Pi}_1 XY}(u, u', \tau, \hat{\tau}, x, y)$$

$$= P_{f(\Pi_1) X}(u, x)\, P_{\Pi_1 | X f(\Pi_1)}(\tau | x, u)\, P_{f'(\Pi_1)|\Pi_1}(u' | \tau)\, P_{Y|X}(y|x)\, P_{\hat{\Pi}_1 | f(\Pi_1) f'(\Pi_1) \Pi_1 XY}(\hat{\tau} | u, u', \tau, x, y). \tag{24}$$

Note that the simple protocol above is deterministic and therefore by Remark 1

$$d_{\rm var}\big(P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \hat{\Pi}_1 XY}, P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \Pi_1 XY}\big) = \Pr\big(\hat{\Pi}_1 \ne \Pi_1\big)$$

$$\le \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{Q_{\Pi_1|Y}}\big) + N_{Q_{\Pi_1|Y}} 2^{-\gamma}, \tag{25}$$

where the inequality is by Corollary 18.

On the other hand, the joint distribution of random variables involved in Protocol 3 can be factorized
as

$$P_{U_{\rm sim} f'(\Pi_{1X}) \Pi_{1X} \Pi_{1Y} XY}(u, u', \tau, \hat{\tau}, x, y)$$

$$= P_{U_{\rm sim}}(u)\, P_X(x)\, P_{\Pi_1 | X f(\Pi_1)}(\tau | x, u)\, P_{f'(\Pi_1)|\Pi_1}(u' | \tau)\, P_{Y|X}(y|x)\, P_{\hat{\Pi}_1 | f(\Pi_1) f'(\Pi_1) \Pi_1 XY}(\hat{\tau} | u, u', \tau, x, y). \tag{26}$$

Therefore, the simulation error for Protocol 3 is bounded as

$$d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY}, P_{\Pi_1\Pi_1 XY}\big)$$

$$\le d_{\rm var}\big(P_{U_{\rm sim} f'(\Pi_{1X}) \Pi_{1X} \Pi_{1Y} XY}, P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \Pi_1 XY}\big)$$

$$\le d_{\rm var}\big(P_{U_{\rm sim} f'(\Pi_{1X}) \Pi_{1X} \Pi_{1Y} XY}, P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \hat{\Pi}_1 XY}\big) + d_{\rm var}\big(P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \hat{\Pi}_1 XY}, P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \Pi_1 XY}\big)$$

$$= d_{\rm var}\big(P_{U_{\rm sim}} P_X, P_{f(\Pi_1) X}\big) + d_{\rm var}\big(P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \hat{\Pi}_1 XY}, P_{f(\Pi_1) f'(\Pi_1) \Pi_1 \Pi_1 XY}\big)$$

$$\le d_{\rm var}\big(P_{U_{\rm sim}} P_X, P_{f(\Pi_1) X}\big) + \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{Q_{\Pi_1|Y}}\big) + N_{Q_{\Pi_1|Y}} 2^{-\gamma},$$

where the first inequality is by the monotonicity of $d_{\rm var}$, the second inequality is by the triangular
inequality, the equality is by the fact that replacing $P_{U_{\rm sim}} P_X$ with $P_{f(\Pi_1) X}$ is the only difference between
the factorizations in (26) and (24), and the final inequality is by (25). The desired bound on simulation
error for Protocol 3 follows by using Lemma 9 to get

$$d_{\rm var}\big(P_{U_{\rm sim}} P_X, P_{f(\Pi_1) X}\big) \le \frac{1}{2}\sqrt{2^{k - H_{\min}(P_{\Pi_1 X}|Q_X)}}.$$

Since Protocol 3 uses shared randomness $U_{\rm sim}$ instead of sending $f(\Pi_1)$, it communicates $k$ fewer bits
in comparison with the simple protocol above, which completes the proof. ■

18 When the protocol terminates before the $N_{Q_{\Pi_1|Y}}$th round, a part of $(f(\Pi_1), f'(\Pi_1))$ may not be sent.


D. Improved simulation of $\Pi_1$

In Protocol 3 we were able to reduce the communication by roughly $H_{\min}(P_{\Pi_1 X}|Q_X)$ bits by simulating
a $\Pi_1$ such that if we use Protocol 2 for sending $\Pi_1$ to Party 2, a portion of the required communication
can be treated as shared public randomness. However, this is the least reduction in communication we can
obtain in the worst case. In this section, we slice the spectrum of $h_{P_{\Pi_1|X}}(\Pi_1|X)$ to obtain an instantaneous
reduction of roughly $h_{P_{\Pi_1|X}}(\Pi_1|X)$ bits.

Denote by $J$ a random variable which takes the value $j \in \{0, 1, \ldots, N_{P_{\Pi_1|X}}\}$ if $(\Pi_1, X) \in \mathcal{T}^{(j)}_{P_{\Pi_1|X}}$. In
our modified protocol, Party 1 first samples $J$ and sends it to Party 2. Then, they proceed with Protocol 3
for $P_{\Pi_1 XY|J=j}$ by selecting $k$ to be less than $H_{\min}(P_{\Pi_1 X|J=j}|Q_X)$ for an appropriately chosen $Q_X$. Let
$\mathcal{J}_g$ be the set of "good" indices $j > 0$ with

$$P_J(j) \ge \frac{1}{N^2_{P_{\Pi_1|X}}};$$

it holds that

$$P_J(\mathcal{J}_g^c) \le \Pr\big((\Pi_1, X) \in \mathcal{T}^{(0)}_{P_{\Pi_1|X}}\big) + \frac{1}{N_{P_{\Pi_1|X}}}.$$

Note that for $j \in \mathcal{J}_g$, with $Q_X = P_X$, we have

$$H_{\min}(P_{\Pi_1 X|J=j}|P_X) = \min_{(\tau, x) \in \mathcal{T}^{(j)}_{P_{\Pi_1|X}}} -\log\frac{P_{\Pi_1 X|J}(\tau, x|j)}{P_X(x)}$$

$$= \min_{(\tau, x) \in \mathcal{T}^{(j)}_{P_{\Pi_1|X}}} -\log\frac{P_{\Pi_1|X}(\tau|x)}{P_J(j)}$$

$$\ge \lambda^{\min}_{P_{\Pi_1|X}} + (j-1)\Delta_{P_{\Pi_1|X}} - 2\log N_{P_{\Pi_1|X}}.$$

Protocol 4: Improved simulation of $\Pi_1$

Input: Observations $X$ and $Y$ with distribution $P_{XY}$, uniform public randomness
$U = (U_{\rm hash}, U_{\rm sim})$, and parameters $\lambda^{\min}_{P_{\Pi_1|Y}}$, $\Delta_{P_{\Pi_1|Y}}$, $N_{P_{\Pi_1|Y}}$, $\lambda^{\min}_{P_{\Pi_1|X}}$, $\Delta_{P_{\Pi_1|X}}$, $N_{P_{\Pi_1|X}}$, and $\gamma$
Output: Estimates $\Pi_{1X}$ and $\Pi_{1Y}$ of $\Pi_1$
Party 1 generates $J \sim P_{J|X}(\cdot|X)$ and sends it to Party 2.
if $J = j \in \mathcal{J}_g$ then
    Parties use Protocol 3 with auxiliary distribution $P_{\Pi_1 Y}$, parameters $\gamma$, $\lambda^{\min}_{P_{\Pi_1|Y}}$, $\Delta_{P_{\Pi_1|Y}}$, $N_{P_{\Pi_1|Y}}$,
    and $k = \lambda^{\min}_{P_{\Pi_1|X}} + (j-1)\Delta_{P_{\Pi_1|X}} - 2\log N_{P_{\Pi_1|X}} - 2\gamma + 2$ to simulate $\Pi_{1X}$ and $\Pi_{1Y}$ for the
    distribution $P_{\Pi_1 XY|J=j}$
else
    protocol declares an error

Our modified simulation protocol is described in Protocol 4. The following lemma provides a bound
on the simulation error.

Lemma 20 (Performance of Protocol 4). Protocol 4 sends at most

$$\big(h_{P_{\Pi_1|Y}}(\Pi_{1X}|Y) - h_{P_{\Pi_1|X}}(\Pi_{1X}|X) + N_{P_{\Pi_1|Y}} + 3\log N_{P_{\Pi_1|X}} + \Delta_{P_{\Pi_1|Y}} + \Delta_{P_{\Pi_1|X}} + 3\gamma\big)_+$$

bits when $(\Pi_{1X}, Y) \notin \mathcal{T}^{(0)}_{P_{\Pi_1|Y}}$, and has simulation error

$$d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY}, P_{\Pi_1\Pi_1 XY}\big)$$

$$\le \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{P_{\Pi_1|Y}}\big) + \Pr\big((\Pi_1, X) \in \mathcal{T}^{(0)}_{P_{\Pi_1|X}}\big) + (N_{P_{\Pi_1|Y}} + 1)2^{-\gamma} + \frac{1}{N_{P_{\Pi_1|X}}}.$$

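Proof: First, we have

$$d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY}, P_{\Pi_1\Pi_1 XY}\big) \le d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XYJ}, P_{\Pi_1\Pi_1 XYJ}\big)$$

$$= \sum_j P_J(j)\, d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY|J=j}, P_{\Pi_1\Pi_1 XY|J=j}\big)$$

$$\le \sum_{j \in \mathcal{J}_g} P_J(j)\, d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY|J=j}, P_{\Pi_1\Pi_1 XY|J=j}\big) + P_J(\mathcal{J}_g^c)$$

$$\le \sum_{j \in \mathcal{J}_g} P_J(j)\, d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY|J=j}, P_{\Pi_1\Pi_1 XY|J=j}\big) + \Pr\big((\Pi_1, X) \in \mathcal{T}^{(0)}_{P_{\Pi_1|X}}\big) + \frac{1}{N_{P_{\Pi_1|X}}}.$$

Then, we apply Lemma 19 with $Q_X = P_X$ for each $j \in \mathcal{J}_g$, and get

$$d_{\rm var}\big(P_{\Pi_{1X}\Pi_{1Y}XY|J=j}, P_{\Pi_1\Pi_1 XY|J=j}\big)$$

$$\le \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{P_{\Pi_1|Y}} \mid J = j\big) + N_{P_{\Pi_1|Y}} 2^{-\gamma} + \frac{1}{2}\sqrt{2^{k - H_{\min}(P_{\Pi_1 X|J=j}|P_X)}}$$

$$\le \Pr\big((\Pi_1, Y) \in \mathcal{T}^{(0)}_{P_{\Pi_1|Y}} \mid J = j\big) + (N_{P_{\Pi_1|Y}} + 1)2^{-\gamma}. \tag{27}$$

Thus, we have the desired bound on simulation error.

Next, we prove the claimed bound on the number of bits sent by the protocol. By Lemma 19, the fact
that $J$ can be sent by using at most $\log N_{P_{\Pi_1|X}} + 1$ bits, and the choice of $k$ in Protocol 4, for $J = j$ the
protocol above communicates at most

$$h_{P_{\Pi_1|Y}}(\Pi_{1X}|Y) + \Delta_{P_{\Pi_1|Y}} + N_{P_{\Pi_1|Y}} + \gamma + \log N_{P_{\Pi_1|X}} + 2 - k$$

$$\le h_{P_{\Pi_1|Y}}(\Pi_{1X}|Y) - \lambda^{\min}_{P_{\Pi_1|X}} - (j-1)\Delta_{P_{\Pi_1|X}} + \Delta_{P_{\Pi_1|Y}} + N_{P_{\Pi_1|Y}} + 3\log N_{P_{\Pi_1|X}} + 3\gamma$$

$$\le h_{P_{\Pi_1|Y}}(\Pi_{1X}|Y) - h_{P_{\Pi_1|X}}(\Pi_{1X}|X) + \Delta_{P_{\Pi_1|X}} + \Delta_{P_{\Pi_1|Y}} + N_{P_{\Pi_1|Y}} + 3\log N_{P_{\Pi_1|X}} + 3\gamma,$$

where the previous inequality holds since for $\Pi_{1X}$ generated by $P_{\Pi_1|X f(\Pi_1) J}(\cdot \mid X, U_{\rm sim}, j)$

$$\lambda^{\min}_{P_{\Pi_1|X}} + j\Delta_{P_{\Pi_1|X}} \ge h_{P_{\Pi_1|X}}(\Pi_{1X}|X),$$

for each $j \in \mathcal{J}_g$. ■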
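The role of the constant in the choice of $k$ in Protocol 4 can be checked with simple arithmetic. The sketch below is an illustration only; the function names and the numerical values are hypothetical. It evaluates the leftover-hash term $\frac{1}{2}\sqrt{2^{k - H_{\min}}}$ at the lower bound on $H_{\min}(P_{\Pi_1 X|J=j}|P_X)$ derived for good indices $j$, and shows that it equals $2^{-\gamma}$, which is the extra $2^{-\gamma}$ absorbed into $(N_{P_{\Pi_1|Y}} + 1)2^{-\gamma}$ in (27).

```python
import math

def leftover_hash_bound(k, h_min):
    """Leftover-hash term (1/2) * sqrt(2^(k - H_min))."""
    return 0.5 * math.sqrt(2.0 ** (k - h_min))

def protocol4_k(lam_min_x, delta_x, n_x, j, gamma):
    """Number of shared random bits k selected in Protocol 4 for slice index j."""
    return lam_min_x + (j - 1) * delta_x - 2 * math.log2(n_x) - 2 * gamma + 2

def h_min_lower_bound(lam_min_x, delta_x, n_x, j):
    """Lower bound on H_min(P_{Pi_1 X | J=j} | P_X) for a good index j."""
    return lam_min_x + (j - 1) * delta_x - 2 * math.log2(n_x)

# Hypothetical parameter values, with logarithms taken to base 2.
k = protocol4_k(lam_min_x=10, delta_x=2, n_x=8, j=3, gamma=4)
bound = leftover_hash_bound(k, h_min_lower_bound(10, 2, 8, 3))
```

With these values `k - h_min = -2*gamma + 2`, so the bound is `0.5 * 2**(1 - gamma) = 2**(-gamma)` regardless of the slice index.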

E. Simulation of $\Pi$

We are now in a position to describe our complete simulation protocol. Consider an interactive protocol
$\pi$ with maximum number of rounds $r_{\max} = d < \infty$. We simply apply Protocol 4 for each round $\Pi_t$ of
$\Pi$. Our overall simulation protocol is described in Protocol 5. In each round we use Protocol 4 assuming
that the simulation up to the previous round has succeeded, where, for the rounds with even numbers,
we use Protocol 4 by interchanging the roles of Party 1 and Party 2.


Protocol 5: Simulation of $\Pi$

Input: Observations $X$ and $Y$ with distribution $P_{XY}$, uniform public randomness
$U = (U_{t,{\rm hash}}, U_{t,{\rm sim}} : t = 1, \ldots, d)$, and parameters $\lambda^{\min}_{P_{\Pi_t|X\Pi^{t-1}}}$, $\Delta_{P_{\Pi_t|X\Pi^{t-1}}}$, $N_{P_{\Pi_t|X\Pi^{t-1}}}$,
$\lambda^{\min}_{P_{\Pi_t|Y\Pi^{t-1}}}$, $\Delta_{P_{\Pi_t|Y\Pi^{t-1}}}$, $N_{P_{\Pi_t|Y\Pi^{t-1}}}$ for $t = 1, \ldots, d$ and $\gamma$
Output: Estimates $\Pi_X$ and $\Pi_Y$ of $\Pi$
while Total communication is less than $l_{\max}$ bits, and simulation not ended do
    Party 1 and Party 2, respectively, use estimates $\Pi_X^{t-1}$ and $\Pi_Y^{t-1}$ for $\Pi^{t-1}$
    Parties use Protocol 4 for simulating $P_{\Pi_t (X\Pi^{t-1})(Y\Pi^{t-1})}$ with parameters $\lambda^{\min}_{P_{\Pi_t|X\Pi^{t-1}}}$, $\Delta_{P_{\Pi_t|X\Pi^{t-1}}}$,
    $N_{P_{\Pi_t|X\Pi^{t-1}}}$, $\lambda^{\min}_{P_{\Pi_t|Y\Pi^{t-1}}}$, $\Delta_{P_{\Pi_t|Y\Pi^{t-1}}}$, $N_{P_{\Pi_t|Y\Pi^{t-1}}}$ and $\gamma$
    Update $t \to t + 1$
if Total communication exceeds $l_{\max}$ bits then
    Declare an error

The following lemma provides a bound on the simulation error.

Lemma 21 (Performance of Protocol 5). Protocol 5 sends at most $l_{\max}$ bits, and has simulation
error

$$d_{\rm var}\big(P_{\Pi_X\Pi_Y XY}, P_{\Pi\Pi XY}\big) \le \Pr\Big(\mathrm{ic}(\Pi; X, Y) + \sum_{t=1}^{d}\delta_t > l_{\max}\Big)$$

$$+ \sum_{t=1}^{d}\Big[4\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + 4\Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big)$$

$$+ 3\big(N_{P_{\Pi_t|Y\Pi^{t-1}}} + N_{P_{\Pi_t|X\Pi^{t-1}}} + 2\big)2^{-\gamma} + \frac{3}{N_{P_{\Pi_t|X\Pi^{t-1}}}} + \frac{3}{N_{P_{\Pi_t|Y\Pi^{t-1}}}}\Big],$$

where

$$\delta_t = \begin{cases} N_{P_{\Pi_t|Y\Pi^{t-1}}} + 3\log N_{P_{\Pi_t|X\Pi^{t-1}}} + \Delta_{P_{\Pi_t|X\Pi^{t-1}}} + \Delta_{P_{\Pi_t|Y\Pi^{t-1}}} + 3\gamma & \text{odd } t, \\ N_{P_{\Pi_t|X\Pi^{t-1}}} + 3\log N_{P_{\Pi_t|Y\Pi^{t-1}}} + \Delta_{P_{\Pi_t|X\Pi^{t-1}}} + \Delta_{P_{\Pi_t|Y\Pi^{t-1}}} + 3\gamma & \text{even } t. \end{cases} \tag{28}$$

Remark 5. The fudge parameters $\epsilon'$ and $\lambda'$ are given by

$$\epsilon' = \sum_{t=1}^{d}\Big[4\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + 4\Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big)$$

$$+ 3\big(N_{P_{\Pi_t|Y\Pi^{t-1}}} + N_{P_{\Pi_t|X\Pi^{t-1}}} + 2\big)2^{-\gamma} + \frac{3}{N_{P_{\Pi_t|X\Pi^{t-1}}}} + \frac{3}{N_{P_{\Pi_t|Y\Pi^{t-1}}}}\Big],$$

$$\lambda' = \sum_{t=1}^{d}\delta_t,$$

where $\delta_t$ is given by (28).

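Proof: Consider a virtual protocol which does not terminate even if the total number of bits exceeds
$l_{\max}$. Denote the output of this protocol by $\tilde{\Pi}_X = (\tilde{\Pi}_{1X}, \ldots, \tilde{\Pi}_{dX})$ and $\tilde{\Pi}_Y = (\tilde{\Pi}_{1Y}, \ldots, \tilde{\Pi}_{dY})$. We have

$$d_{\rm var}\big(P_{\Pi_X\Pi_Y XY}, P_{\Pi\Pi XY}\big)$$

$$\le d_{\rm var}\big(P_{\Pi_X\Pi_Y XY}, P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}\big) + d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big)$$

$$\le \Pr\big((\Pi_X, \Pi_Y) \ne (\tilde{\Pi}_X, \tilde{\Pi}_Y)\big) + d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big). \tag{29}$$

First, we bound the second term of (29). By using the triangular inequality repeatedly and by using Lemma
20, we have

$$d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big)$$

$$\le d_{\rm var}\big(P_{\tilde{\Pi}_{1X}\tilde{\Pi}_{1Y}\cdots\tilde{\Pi}_{(d-1)X}\tilde{\Pi}_{(d-1)Y}\tilde{\Pi}_{dX}\tilde{\Pi}_{dY}XY}, P_{\Pi_1\Pi_1\cdots\Pi_{d-1}\Pi_{d-1}\tilde{\Pi}_{dX}\tilde{\Pi}_{dY}XY}\big)$$

$$+ d_{\rm var}\big(P_{\Pi_1\Pi_1\cdots\Pi_{d-1}\Pi_{d-1}\tilde{\Pi}_{dX}\tilde{\Pi}_{dY}XY}, P_{\Pi_1\Pi_1\cdots\Pi_{d-1}\Pi_{d-1}\Pi_d\Pi_d XY}\big)$$

$$\le d_{\rm var}\big(P_{\tilde{\Pi}_{1X}\tilde{\Pi}_{1Y}\cdots\tilde{\Pi}_{(d-1)X}\tilde{\Pi}_{(d-1)Y}XY}, P_{\Pi_1\Pi_1\cdots\Pi_{d-1}\Pi_{d-1}XY}\big)$$

$$+ d_{\rm var}\big(P_{\tilde{\Pi}_{dX}\tilde{\Pi}_{dY}(X\Pi^{d-1})(Y\Pi^{d-1})}, P_{\Pi_d\Pi_d(X\Pi^{d-1})(Y\Pi^{d-1})}\big)$$

$$\vdots$$

$$\le \sum_{t=1}^{d} d_{\rm var}\big(P_{\tilde{\Pi}_{tX}\tilde{\Pi}_{tY}(X\Pi^{t-1})(Y\Pi^{t-1})}, P_{\Pi_t\Pi_t(X\Pi^{t-1})(Y\Pi^{t-1})}\big)$$

$$\le \sum_{t:\,{\rm odd}}\Big[\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + \Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big) + \big(N_{P_{\Pi_t|Y\Pi^{t-1}}} + 1\big)2^{-\gamma} + \frac{1}{N_{P_{\Pi_t|X\Pi^{t-1}}}}\Big]$$

$$+ \sum_{t:\,{\rm even}}\Big[\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + \Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big) + \big(N_{P_{\Pi_t|X\Pi^{t-1}}} + 1\big)2^{-\gamma} + \frac{1}{N_{P_{\Pi_t|Y\Pi^{t-1}}}}\Big]$$

$$\le \sum_{t=1}^{d}\Big[\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + \Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big)$$

$$+ \big(N_{P_{\Pi_t|Y\Pi^{t-1}}} + N_{P_{\Pi_t|X\Pi^{t-1}}} + 2\big)2^{-\gamma} + \frac{1}{N_{P_{\Pi_t|X\Pi^{t-1}}}} + \frac{1}{N_{P_{\Pi_t|Y\Pi^{t-1}}}}\Big].$$

Denote

$$l(X, Y, \Pi_X, \Pi_Y) := \sum_{t:\,{\rm odd}}\Big[h_{P_{\Pi_t|Y\Pi^{t-1}}}(\Pi_{tX}|Y, \Pi_Y^{t-1}) - h_{P_{\Pi_t|X\Pi^{t-1}}}(\Pi_{tX}|X, \Pi_X^{t-1})\Big]$$

$$+ \sum_{t:\,{\rm even}}\Big[h_{P_{\Pi_t|X\Pi^{t-1}}}(\Pi_{tY}|X, \Pi_X^{t-1}) - h_{P_{\Pi_t|Y\Pi^{t-1}}}(\Pi_{tY}|Y, \Pi_Y^{t-1})\Big]. \tag{30}$$

Since $(\Pi_X, \Pi_Y)$ coincides with $(\tilde{\Pi}_X, \tilde{\Pi}_Y)$ when the accumulated message length of the protocol generating
$(\tilde{\Pi}_X, \tilde{\Pi}_Y)$ does not exceed $l_{\max}$, and since the message length of each round is bounded by each term
of $l(X, Y, \tilde{\Pi}_X, \tilde{\Pi}_Y)$ plus $\delta_t$ by Lemma 20 unless $(\tilde{\Pi}_{tX}, (Y, \tilde{\Pi}_Y^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}$ or $(\tilde{\Pi}_{tY}, (X, \tilde{\Pi}_X^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}$,
we have

$$\Pr\big((\Pi_X, \Pi_Y) \ne (\tilde{\Pi}_X, \tilde{\Pi}_Y)\big)$$

$$\le \Pr\Big(l(X, Y, \tilde{\Pi}_X, \tilde{\Pi}_Y) + \sum_{t=1}^{d}\delta_t > l_{\max}\Big)$$

$$+ \Pr\Big(\bigcup_{t:\,{\rm odd}}\big\{(\tilde{\Pi}_{tX}, (Y, \tilde{\Pi}_Y^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big\} \cup \bigcup_{t:\,{\rm even}}\big\{(\tilde{\Pi}_{tY}, (X, \tilde{\Pi}_X^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big\}\Big). \tag{31}$$

Since

$$\Pr\big((X, Y, \tilde{\Pi}_X, \tilde{\Pi}_Y) \in \mathcal{E}\big) \le \Pr\big((X, Y, \Pi, \Pi) \in \mathcal{E}\big) + d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big)$$

for any event $\mathcal{E}$, it follows from (31) that

$$\Pr\big((\Pi_X, \Pi_Y) \ne (\tilde{\Pi}_X, \tilde{\Pi}_Y)\big)$$

$$\le \Pr\Big(l(X, Y, \Pi, \Pi) + \sum_{t=1}^{d}\delta_t > l_{\max}\Big)$$

$$+ \Pr\Big(\bigcup_{t:\,{\rm odd}}\big\{(\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big\} \cup \bigcup_{t:\,{\rm even}}\big\{(\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big\}\Big)$$

$$+ 2 d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big)$$

$$\le \Pr\Big(l(X, Y, \Pi, \Pi) + \sum_{t=1}^{d}\delta_t > l_{\max}\Big)$$

$$+ \sum_{t=1}^{d}\Big[\Pr\big((\Pi_t, (Y, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|Y\Pi^{t-1}}}\big) + \Pr\big((\Pi_t, (X, \Pi^{t-1})) \in \mathcal{T}^{(0)}_{P_{\Pi_t|X\Pi^{t-1}}}\big)\Big]$$

$$+ 2 d_{\rm var}\big(P_{\tilde{\Pi}_X\tilde{\Pi}_Y XY}, P_{\Pi\Pi XY}\big).$$

Thus, by combining this bound with (29) and (30), and by noting

$$l(X, Y, \Pi, \Pi) = \mathrm{ic}(\Pi; X, Y), \tag{32}$$

we have the desired bound on simulation error. ■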

VII. Asymptotic Optimality 

We now present the proofs of Theorem 3 and Theorem 7. Both the proofs rely on carefully choosing
the slice-sizes in the lower and upper bounds.

A. Proof of Theorem 3

We start with the upper bound. Note that, for IID random variables $(\Pi^n, X^n, Y^n)$, the spectrums of
$h(\Pi_t^n | Z^n, (\Pi^{t-1})^n)$ for$^{19}$ $Z = X$ or $Y$ have width $O(\sqrt{n})$. Therefore, the parameters $\Delta$s and $N$s that
appear in the fudge parameters can be chosen as $O(n^{1/4})$. Specifically, by standard measure concentration
bounds (for bounded random variables), for every $\nu > 0$, there exists a constant$^{20}$ $c > 0$ such that with

$$\lambda^{\min}_{P_{\Pi_t^n|Z^n(\Pi^{t-1})^n}} = nH(\Pi_t | Z, \Pi^{t-1}) - c\sqrt{n},$$

$$\lambda^{\max}_{P_{\Pi_t^n|Z^n(\Pi^{t-1})^n}} = nH(\Pi_t | Z, \Pi^{t-1}) + c\sqrt{n},$$

the following bound holds:

$$\Pr\big((\Pi_t^n, (Z^n, (\Pi^{t-1})^n)) \in \mathcal{T}^{(0)}_{P_{\Pi_t^n|Z^n(\Pi^{t-1})^n}}\big) \le \nu. \tag{33}$$

Let T denote the third central moment of the random variable ic(II;X, Y). For 

= nIC(n) + (e ~ Mu - —A—) , 

choosing $\Delta_{P_{\Pi_t^n|Z^n(\Pi^{t-1})^n}} = N_{P_{\Pi_t^n|Z^n(\Pi^{t-1})^n}} = \gamma = \sqrt{2c}\, n^{1/4}$, and $l_{\max} = \lambda_n + \sum_{t=1}^{d}\delta_t$ in Theorem 2 (for
the definition of the fudge parameters, see Remark 5), we get a protocol of length $l_{\max}$ and satisfying

$$d_{\rm var}\big(P_{\Pi_X^n\Pi_Y^n X^n Y^n}, P_{\Pi^n\Pi^n X^n Y^n}\big) \le \Pr\big(\mathrm{ic}(\Pi^n; X^n, Y^n) > \lambda_n\big) + 9d\nu$$

for sufficiently large $n$. By its definition given in (28), $\delta_t = O(n^{1/4})$ for the choice of parameters above.
Thus, the Berry-Esseen theorem (cf. [18]) and the observation above give a protocol of length $l_{\max}$
attaining $\epsilon$-simulation. Therefore, using the Taylor approximation of $Q(\cdot)$ yields the achievability of the
claimed protocol length.

For the lower bound, we fix sufficiently small constant S > 0, and we set A ^ n = n(H(X,Y) — 8 ), 
A<JL = n{H(X,Y)+8), A^ n = n(H(X\Y,U)-8), a£L = n(H(X\Y,U)+6), A^ n = n(H(XUAYU)- 

(3) 

8 ), Amax = n{H(XUAYU) + J), respectively. Then, by standard measure concentration bounds imply 
that the tail probability e ta ii in ([3]) is bounded above by - for some constant c > 0. We also set p = A. 


19 We use this notation throughout this section to avoid repetition. 

20 Although the constant depends on random variables appearing in each round, since the number of rounds is bounded, we
take the maximum constant so that (33) holds for every $t$.
For these choices of parameters, we note that the fudge parameter is $\lambda' = O(\log n)$. Thus, by setting

A = A„ = mow + (e + + 2 v( J 3 /2^ ) 

= nIC(7r) + yj nV (tt)Q ^ 1 (e) + O(logn), 

where the final equality is by the Taylor approximation, an application of the Berry-Esseen theorem to
the bound in Q gives the desired lower bound on the protocol length. ■ 
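The leading terms of the second-order expression above, $n\,\mathrm{IC}(\pi) + \sqrt{nV(\pi)}\,Q^{-1}(\epsilon)$, are easy to evaluate numerically. The sketch below is illustrative only: it assumes $Q$ is the standard Gaussian tail function, drops the $O(\log n)$ term, and uses hypothetical function names and input values.

```python
import math
from statistics import NormalDist

def q_inverse(eps):
    """Inverse of the standard Gaussian tail Q(a) = Pr(G > a), for 0 < eps < 1."""
    return NormalDist().inv_cdf(1.0 - eps)

def second_order_length(n, ic, v, eps):
    """Evaluate n*IC + sqrt(n*V) * Q^{-1}(eps), ignoring lower-order terms."""
    return n * ic + math.sqrt(n * v) * q_inverse(eps)

# Hypothetical values of IC(pi) and V(pi), in bits.
for n in (100, 10000):
    approx = second_order_length(n, ic=2.0, v=1.5, eps=0.01)
```

For $\epsilon < 1/2$ the correction term is positive, so the approximation exceeds $n\,\mathrm{IC}(\pi)$ by an amount of order $\sqrt{n}$; at $\epsilon = 1/2$ the correction vanishes.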

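B. Proof of Theorem 5

Theorem 1 implies that if a protocol $\pi_{\rm sim}$ is such that

$$\log|\pi_{\rm sim}| < \lambda - \lambda', \tag{34}$$

then its simulation error must be larger than

$$\Pr\big(\mathrm{ic}(\Pi^n; X^n, Y^n) > \lambda\big) - \epsilon'. \tag{35}$$

To compute fudge parameters, we set $\lambda^{(1)}_{\min} = n(H(X, Y) - \delta)$, $\lambda^{(1)}_{\max} = n(H(X, Y) + \delta)$, $\lambda^{(2)}_{\min} =
n(H(X|Y, \Pi) - \delta)$, $\lambda^{(2)}_{\max} = n(H(X|Y, \Pi) + \delta)$, $\lambda^{(3)}_{\min} = n(H(X\Pi \triangle Y\Pi) - \delta)$, $\lambda^{(3)}_{\max} = n(H(X\Pi \triangle Y\Pi) +
\delta)$, respectively. By the Chernoff bound, there exists $E_1 > 0$ such that

$$\epsilon_{\rm tail} \le 2^{-E_1 n}.$$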

Furthermore, A, = 0(n) for i = 1,2, 3. We set rj = 2~^ n . It follows that 

e' < 2 ~ E ^n _|_ (36) 

and 

A' < -n + 0(logn). (37) 

o 

Finally, upon setting 

A = raIC(7r) — - (38) 

o 

and applying the Chernoff bound once more, we obtain a constant $E_2 > 0$ such that

$$\Pr\big(\mathrm{ic}(\Pi^n; X^n, Y^n) > \lambda\big) \ge 1 - 2^{-E_2 n}. \tag{39}$$

The result follows upon combining (34)-(39).


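C. Proof of Theorem 7

For a sequence of protocols $\boldsymbol{\pi} = \{\pi_n\}_{n=1}^{\infty}$ and a sequence of observations $(\mathbf{X}, \mathbf{Y}) = \{(X_n, Y_n)\}_{n=1}^{\infty}$,
let

$$\underline{H}(\boldsymbol{\Pi}_t | \mathbf{Z}, \boldsymbol{\Pi}^{t-1}) = \sup\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(\Pi_{n,t} | Z_n \Pi_n^{t-1}) < \alpha\big) = 0\Big\}, \tag{40}$$

$$\overline{H}(\boldsymbol{\Pi}_t | \mathbf{Z}, \boldsymbol{\Pi}^{t-1}) = \inf\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(\Pi_{n,t} | Z_n \Pi_n^{t-1}) > \alpha\big) = 0\Big\}, \tag{41}$$

where $\mathbf{Z} = \mathbf{X}$ or $\mathbf{Y}$, and $\boldsymbol{\Pi}_t = \{\Pi_{n,t}\}_{n=1}^{\infty}$ and $\boldsymbol{\Pi}^{t-1} = \{\Pi_n^{t-1}\}_{n=1}^{\infty}$ are sequences of transcripts of the $t$th round
and up to the $(t-1)$th round, respectively. For the achievability part, we fix an arbitrarily small $\delta > 0$, and set

$$\lambda^{\min}_{P_{\Pi_{n,t}|Z_n\Pi_n^{t-1}}} = n\big(\underline{H}(\boldsymbol{\Pi}_t | \mathbf{Z}, \boldsymbol{\Pi}^{t-1}) - \delta\big),$$

$$\lambda^{\max}_{P_{\Pi_{n,t}|Z_n\Pi_n^{t-1}}} = n\big(\overline{H}(\boldsymbol{\Pi}_t | \mathbf{Z}, \boldsymbol{\Pi}^{t-1}) + \delta\big),$$

and $\Delta_{P_{\Pi_{n,t}|Z_n\Pi_n^{t-1}}} = N_{P_{\Pi_{n,t}|Z_n\Pi_n^{t-1}}} = \gamma = \sqrt{2\delta n}$. We set

$$l_{\max} = n\big(\overline{\mathrm{IC}}(\boldsymbol{\pi}) + \delta\big) + \sum_{t=1}^{d}\delta_t$$

$$= n\big(\overline{\mathrm{IC}}(\boldsymbol{\pi}) + \delta\big) + O(\sqrt{n}),$$

where $\delta_t$ is given by (28). Then, by Theorem 5, by the definition of $\overline{\mathrm{IC}}(\boldsymbol{\pi})$, and by (40) and (41), there
exists a simulation protocol of length $l_{\max}$ with vanishing simulation error. Since $\delta > 0$ is arbitrary, we
have the desired achievability bound.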

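For the converse part, we fix arbitrary $\delta > 0$, and set $\lambda^{(1)}_{\min} = n(\underline{H}(\mathbf{X}, \mathbf{Y}) - \delta)$, $\lambda^{(1)}_{\max} = n(\overline{H}(\mathbf{X}, \mathbf{Y}) + \delta)$,
$\lambda^{(2)}_{\min} = n(\underline{H}(\mathbf{X} | \mathbf{Y}, \boldsymbol{\Pi}) - \delta)$, $\lambda^{(2)}_{\max} = n(\overline{H}(\mathbf{X} | \mathbf{Y}, \boldsymbol{\Pi}) + \delta)$, $\lambda^{(3)}_{\min} = n(\underline{H}(\mathbf{X}\boldsymbol{\Pi} \triangle \mathbf{Y}\boldsymbol{\Pi}) - \delta)$, $\lambda^{(3)}_{\max} =
n(\overline{H}(\mathbf{X}\boldsymbol{\Pi} \triangle \mathbf{Y}\boldsymbol{\Pi}) + \delta)$, respectively, where

$$\underline{H}(\mathbf{X}, \mathbf{Y}) = \sup\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(X_n Y_n) < \alpha\big) = 0\Big\},$$

$$\overline{H}(\mathbf{X}, \mathbf{Y}) = \inf\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(X_n Y_n) > \alpha\big) = 0\Big\},$$

$$\underline{H}(\mathbf{X} | \mathbf{Y}, \boldsymbol{\Pi}) = \sup\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(X_n | Y_n \Pi_n) < \alpha\big) = 0\Big\},$$

$$\overline{H}(\mathbf{X} | \mathbf{Y}, \boldsymbol{\Pi}) = \inf\Big\{\alpha : \lim_{n\to\infty}\Pr\big(h(X_n | Y_n \Pi_n) > \alpha\big) = 0\Big\}.$$

Then, by the definitions, we find that the tail probability $\epsilon_{\rm tail}$ in (3) converges to 0. We also set $\eta = 1/n$.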
For these choices of parameters, we note that the fudge parameter is X' = O(logn). Thus, by using the 
bound in Q for 

A = A n = n (ic(tt) + 5) , (42) 

and by taking 5 —> 0, we have the desired converse bound. ■ 

VIII. Conclusion 

We have proposed a common randomness decomposition based approach (cf. [48]) to derive a lower
bound on the communication complexity of protocol simulation by relating the protocol simulation problem
to secret key agreement. A key step in our approach is identifying the amount of common randomness
generated through protocol simulation. Our estimate for the amount of common randomness does not
rely on the structure of the function to be computed. This is in contrast to most of the existing lower bounds
on communication complexity for function computation, such as the partition bound or the discrepancy
bound, where the structure of the computed function plays an important role. In particular, a comparison
of our approach with other existing approaches for specific functions is not available. An important future
research agenda for us is to incorporate the structure of functions in our bound; the case of functions
with a small range such as Boolean functions is of particular interest.


Appendix 


To illustrate the utility of our lower bound, we consider a protocol $\pi$ which takes very few values most
of the time, but with very small probability it can send many different transcripts. The proposed protocol
can be $\varepsilon$-simulated using very few bits of communication on average. But in the worst case it requires as
many bits of communication for $\varepsilon$-simulation as needed for data exchange, for all $\varepsilon > 0$ small enough.

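Specifically, let $\mathcal{X} = \mathcal{Y} = \{1, \ldots, 2^n\}$ and let $\pi$ be a deterministic protocol such that the transcript
$\tau(x, y)$ for $(x, y)$ is given by

$$\tau(x, y) = \begin{cases}
a & \text{if } x > \delta 2^n,\ y > \delta 2^n, \\
b & \text{if } x > \delta 2^n,\ y \le \delta 2^n, \\
c & \text{if } x \le \delta 2^n,\ y > \delta 2^n, \\
(x, y) & \text{if } x \le \delta 2^n,\ y \le \delta 2^n,
\end{cases}$$

for some small $\delta > 0$, which will be specified later. Clearly, this protocol is interactive.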
Let $(X, Y)$ be uniform random variables on $\mathcal{X} \times \mathcal{Y}$. Then,

$$\Pr(\Pi \notin \{a, b, c\}) = \delta^2.$$


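Since

$$P_{\Pi|X}(\tau(x, y) \mid x) = \begin{cases}
1 - \delta & \text{if } x > \delta 2^n,\ y > \delta 2^n, \\
\delta & \text{if } x > \delta 2^n,\ y \le \delta 2^n, \\
1 - \delta & \text{if } x \le \delta 2^n,\ y > \delta 2^n, \\
2^{-n} & \text{if } x \le \delta 2^n,\ y \le \delta 2^n,
\end{cases}$$

and similarly for $P_{\Pi|Y}(\tau(x, y) \mid y)$, we have

$$\mathrm{ic}(\tau(x, y); x, y) = \begin{cases}
2\log(1/(1 - \delta)) & \text{if } x > \delta 2^n,\ y > \delta 2^n, \\
\log(1/\delta) + \log(1/(1 - \delta)) & \text{if } x > \delta 2^n,\ y \le \delta 2^n, \\
\log(1/\delta) + \log(1/(1 - \delta)) & \text{if } x \le \delta 2^n,\ y > \delta 2^n, \\
2n & \text{if } x \le \delta 2^n,\ y \le \delta 2^n.
\end{cases}$$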
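The conditional probabilities and the resulting density can be checked by brute-force enumeration for a small instance. The sketch below is only illustrative: the values `n = 4` and `delta = 1/4` are hypothetical choices made so that $\delta 2^n$ is an integer, and the function names are not from the text.

```python
from fractions import Fraction
from math import log2

n = 4                    # hypothetical small value of n
N = 2 ** n               # alphabet size 2^n
delta = Fraction(1, 4)   # hypothetical; delta * N must be an integer
t = int(delta * N)       # threshold delta * 2^n


def transcript(x, y):
    """Transcript tau(x, y) of the example protocol."""
    if x > t and y > t:
        return "a"
    if x > t:            # here y <= t
        return "b"
    if y > t:            # here x <= t
        return "c"
    return (x, y)        # both inputs are at most t, so they are revealed


def p_given_x(x, y):
    """P_{Pi|X}(tau(x, y) | x), obtained by counting over uniform Y."""
    same = sum(transcript(x, v) == transcript(x, y) for v in range(1, N + 1))
    return Fraction(same, N)


def p_given_y(x, y):
    """P_{Pi|Y}(tau(x, y) | y), obtained by counting over uniform X."""
    same = sum(transcript(u, y) == transcript(x, y) for u in range(1, N + 1))
    return Fraction(same, N)


def ic_density(x, y):
    """Information complexity density in bits: log 1/P(tau|x) + log 1/P(tau|y)."""
    return -log2(float(p_given_x(x, y))) - log2(float(p_given_y(x, y)))


# Probability that the transcript is not one of a, b, c under uniform (X, Y).
revealed = sum(
    1 for x in range(1, N + 1) for y in range(1, N + 1)
    if transcript(x, y) not in ("a", "b", "c")
)
p_reveal = Fraction(revealed, N * N)

print("Pr(transcript not in {a, b, c}):", p_reveal)
for point in [(N, N), (N, 1), (1, N), (1, 1)]:
    print(point, transcript(*point), ic_density(*point))
```

The four printed points cover one representative input pair for each case of the piecewise formula.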


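Consider $\delta = [\ldots]$ and $\varepsilon = [\ldots]$. Note that for any $\lambda < 2n$,

$$\Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) \ge \Pr(\Pi \notin \{a, b, c\}) = \delta^2 = [\ldots] > \varepsilon,$$

and

$$\Pr(\mathrm{ic}(\Pi; X, Y) > 2n) = 0.$$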


Thus, the $\varepsilon$-tail of information complexity density $\lambda_\varepsilon = \sup\{\lambda : \Pr(\mathrm{ic}(\Pi; X, Y) > \lambda) > \varepsilon\}$ is given by

$$\lambda_\varepsilon = 2n. \qquad (43)$$
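The $\varepsilon$-tail can also be evaluated numerically for a small instance of the example. The following is a minimal sketch under assumed values (`n = 4`, `delta = 1/4`, and the two test values of `eps` are hypothetical); it uses the piecewise density from the text and the fact that, for a finitely supported density, the supremum equals the smallest support point v with Pr(ic > v) at most eps.

```python
from collections import defaultdict
from fractions import Fraction
from math import log2

n = 4                    # hypothetical small value of n
N = 2 ** n               # alphabet size 2^n
delta = Fraction(1, 4)   # hypothetical; delta * N must be an integer
t = int(delta * N)       # threshold delta * 2^n


def ic_density(x, y):
    """Piecewise information complexity density of the example, in bits."""
    d = float(delta)
    if x > t and y > t:
        return 2 * log2(1 / (1 - d))
    if x <= t and y <= t:
        return 2.0 * n
    return log2(1 / d) + log2(1 / (1 - d))


def eps_tail(eps):
    """Evaluate sup{lam : Pr(ic > lam) > eps} for uniform (X, Y)."""
    mass = defaultdict(Fraction)
    for x in range(1, N + 1):
        for y in range(1, N + 1):
            mass[ic_density(x, y)] += Fraction(1, N * N)
    above = Fraction(1)  # holds Pr(ic > v) once the mass at v is subtracted
    for v in sorted(mass):
        above -= mass[v]
        if above <= eps:
            # Pr(ic > lam) > eps for every lam < v, and not for lam >= v.
            return v


print("eps below delta^2:", eps_tail(Fraction(1, 32)))
print("eps above delta^2:", eps_tail(Fraction(1, 8)))
```

When `eps` is below $\delta^2$ the tail sits at 2n, as in (43); once `eps` exceeds $\delta^2$ the rare input-revealing event no longer counts and the tail drops to one of the small density values.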


On the other hand, we have

$$\begin{aligned}
\mathrm{IC}(\pi) &= H(\Pi|X) + H(\Pi|Y) \\
&\le 2\delta\left[h_b(\delta) + n - \log(1/\delta)\right] + 2(1 - \delta) h_b(\delta) \\
&\le O(\delta^2),
\end{aligned}$$

where $h_b(\cdot)$ is the binary entropy function.
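For a small instance, the information complexity $H(\Pi|X) + H(\Pi|Y)$ of the example can be computed exactly by enumeration. The sketch below assumes hypothetical values `n = 4` and `delta = 1/4`. The closed form it compares against, $2[h_b(\delta) + \delta^2 \log(\delta 2^n)]$, is worked out here from the definition of the example and is not quoted from the text.

```python
from collections import Counter
from fractions import Fraction
from math import log2

n = 4                    # hypothetical small value of n
N = 2 ** n               # alphabet size 2^n
delta = Fraction(1, 4)   # hypothetical; delta * N must be an integer
t = int(delta * N)       # threshold delta * 2^n


def transcript(x, y):
    """Transcript tau(x, y) of the example protocol."""
    if x > t and y > t:
        return "a"
    if x > t:
        return "b"
    if y > t:
        return "c"
    return (x, y)


def conditional_entropy(given_x):
    """H(Pi | X) if given_x is True, else H(Pi | Y), in bits, for uniform (X, Y)."""
    total = 0.0
    for known in range(1, N + 1):
        counts = Counter(
            transcript(known, other) if given_x else transcript(other, known)
            for other in range(1, N + 1)
        )
        # Entropy of the transcript given this input, averaged over the uniform input.
        total += -sum((c / N) * log2(c / N) for c in counts.values()) / N
    return total


def binary_entropy(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)


ic_exact = conditional_entropy(True) + conditional_entropy(False)
d = float(delta)
closed_form = 2 * (binary_entropy(d) + d ** 2 * log2(d * N))

print("IC by enumeration:", ic_exact)
print("closed form      :", closed_form)
print("worst-case density 2n:", 2 * n)
```

Even for this small instance the average quantity is well below the worst-case density 2n, which is the gap the example is meant to exhibit.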

Also, to evaluate the lower bound of Theorem 1, we bound the fudge parameters in that bound. To that
end, we fix $\varepsilon_{\mathrm{tail}} = 0$ and bound the spectrum lengths $\Lambda_1, \Lambda_2, \Lambda_3$. Since $(X, Y)$ is uniform, $h(X, Y) = 2n$
and so $\Lambda_1 = 0$. Also, note that with probability 1 the conditional entropy density $h(X|\Pi, Y)$ is either 0
or $\log(\delta 2^n)$, which implies $\Lambda_2 = O(n)$. A similar argument shows that $\Lambda_3 = O(n)$. Therefore, the fudge
parameter

$$\lambda' = O(\log \Lambda_1 \Lambda_2 \Lambda_3) = O(\log n),$$

which in view of (43) and Theorem 1 gives $D_\varepsilon(\pi) = \Omega(2n)$.


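Lemma. Consider random variables $X$, $Y$, $Z$ and $V$ taking values in countable sets $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$, and a
finite set $\mathcal{V}$, respectively. Then, for every $0 < \varepsilon < 1/2$,

$$S_{2\varepsilon}(X, Y|ZV) \ge S_\varepsilon(X, Y|Z) - \log|\mathcal{V}| - 2\log(1/2\varepsilon).$$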

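Proof. Consider random variables $K'_x$ and $K'_y$ with a common range $\mathcal{K}'$ such that $(K'_x, K'_y)$ constitutes
an $\varepsilon$-secret key for $X$ and $Y$ given eavesdropper's observation $Z$, recoverable using an interactive protocol
$\pi'$. Let $Q_{K'_x K'_y \Pi' Z V}$ denote the distribution $P^{(2)\prime}_{\mathrm{unif}} P_{\Pi' Z V}$, where $P^{(2)\prime}_{\mathrm{unif}}$ denotes the distribution

$$P^{(2)\prime}_{\mathrm{unif}}(k_x, k_y) = \frac{\mathbb{1}(k_x = k_y)}{|\mathcal{K}'|}, \quad \forall\, k_x, k_y \in \mathcal{K}'.$$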

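Then, by definition of an $\varepsilon$-secret key, it holds that

$$d_{\mathrm{var}}\left(P_{K'_x K'_y \Pi' Z},\, Q_{K'_x K'_y \Pi' Z}\right) \le \varepsilon. \qquad (44)$$

Note that $H_{\min}(Q_{K'_x \Pi' Z} \mid \Pi' Z) \ge \log|\mathcal{K}'|$. Therefore, by Lemma 9 there exists a function $K_x = K(K'_x)$
taking values in a set $\mathcal{K}$ with $\log|\mathcal{K}| \ge \log|\mathcal{K}'| - \log|\mathcal{V}| - 2\log(1/2\varepsilon)$ such that

$$d_{\mathrm{var}}\left(Q_{K_x \Pi' Z V},\, P_{\mathrm{unif}} Q_{\Pi' Z V}\right) \le \varepsilon, \qquad (45)$$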

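where $P_{\mathrm{unif}}$ denotes the uniform distribution on the set $\mathcal{K}$. Upon letting $K_y = K(K'_y)$ and defining
$P^{(2)}_{\mathrm{unif}}$ analogously to $P^{(2)\prime}_{\mathrm{unif}}$ with $\mathcal{K}$ in place of $\mathcal{K}'$, we have

$$\begin{aligned}
d_{\mathrm{var}}\left(P_{K_x K_y \Pi' Z V},\, P^{(2)}_{\mathrm{unif}} P_{\Pi' Z V}\right)
&\le d_{\mathrm{var}}\left(Q_{K_x K_y \Pi' Z V},\, P^{(2)}_{\mathrm{unif}} P_{\Pi' Z V}\right) + \varepsilon \\
&= d_{\mathrm{var}}\left(Q_{K_x \Pi' Z V},\, P_{\mathrm{unif}} P_{\Pi' Z V}\right) + \varepsilon \\
&\le 2\varepsilon,
\end{aligned}$$

where the first inequality is by (44) and the second by (45), and the equality is by the definition of $Q$.
Therefore, $(K_x, K_y)$ constitutes a $2\varepsilon$-secret key of length $\log|\mathcal{K}'| - \log|\mathcal{V}| - 2\log(1/2\varepsilon)$ for $X$ and $Y$
given eavesdropper's observation $(Z, V)$. The claimed bound follows since $(K'_x, K'_y)$ was an arbitrary $\varepsilon$-secret
key for $X$ and $Y$ given eavesdropper's observation $Z$. ■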